Latest Updates and News from The GeneNetwork
2006-05-18: We are experimenting on the Beta site with Scalable Vector Graphics (SVG) displays of scatter plots and other graphs. SVG allows you to modify the display size and area of graphs. You will need an SVG plug-in for your browser and hardware. (Implemented by Jintao Wang.)
2006-05-11: New and final "Eye M430v2 (Apr06) RMA" database has been added to the GN beta and production web sites. This data set includes data for 71 strains including 55 BXD strains, C57BL/6J, DBA/2J, reciprocal F1 hybrids and 12 other strains of mice. The Info file is still incomplete. Data generated by Weikuan Gu, Eldon Geisert, Yan Jian, Lu Lu, and Rob Williams with support from Barrett Haik and the Hamilton Eye Institute. (Implemented by Hongqiang Li, Yanhua Qu, and Jintao Wang.)
2006-04-26: New and final Mouse BXD Eye mRNA expression database is being added to the Beta site using a new quality control procedure. The data are still being error corrected as of April 29, 2006. This data set includes 57 BXD strains, C57BL/6J, DBA/2J, F1 hybrids and 12 other strains of mice. Data generated by Weikuan Gu, Eldon Geisert, Yan Jian, Lu Lu, and Rob Williams with support from Barrett Haik and the Hamilton Eye Institute. (Implemented by Hongqiang Li, Yanhua Qu, and Jintao Wang.)
2006-04-26: Search menus are now being updated so that they provide a complete list of available databases in hierarchical pull-down menus. (Implemented by Jintao Wang.)
2006-04-18: We have converted our Python code to use Mod_python. Mod_python is an Apache module that embeds a Python interpreter within the server and that will often run many times faster than a traditional Common Gateway Interface (CGI). Mod_python will not help much for those processes (e.g., interval mapping or correlation tables) that take a long time to compute. But for fast processes, such as generating AJAX menus or opening data-editing pages, it helps substantially. (Implemented by Jintao Wang.)
2006-03-31: Correlation Results Tables now include a feature to add multiple columns of correlations. This makes it possible to quickly identify well and poorly conserved correlations across data sets and tissues. You may need to use a newer browser to exploit this new feature. (See News item of March 14th; Implemented by Jintao Wang.)
2006-03-16: NCBI Entrez Gene LinkOut established. LinkOut is a service of NCBI and Entrez that allows you to link directly from PubMed and other Entrez databases to a wide range of information and services beyond the Entrez system. NCBI pages now link from mouse and rat genes to GeneNetwork expression data sets. (Implemented by Hongqiang Li.)
2006-03-14: GENSAT BGEM link to GN established. The Brain Gene Expression Map is a large library of in situ gene expression images of the embryonic, neonatal, and adult mouse. It includes data for over 3000 genes. (Implemented by Tom Curran and the BGEM group at St Jude Children's Research Hospital.)
2006-03-15: Correlation Results Tables are now implemented using AJAX code that allows rapid resorting of the top 100, 200, or 500 traits. You will now see small UP and DOWN sort arrows in the column heads. You may need to use a newer browser to exploit this new feature. Being able to resort tables is useful when you would like to filter a list of traits by expression value (usually from high to low) or by position. AJAX is a programming method that makes web pages more responsive and dynamic. (Implemented by Jintao Wang.)
2006-01-20: GeneNetwork's MySQL relational database has been moved to a dual dual-core AMD Opteron 280 computer system assembled by Monarch Computer for improved performance. This system has more than halved the time required to compute correlation tables, from about 100 seconds down to 40 seconds. (Implemented by Jintao Wang.)
2005-12-19: A short Review of GeneNetwork by William R. Lariviere on the American Pain Society web site.
2006-01-03: A GeneWiki system is being implemented. GeneWiki (also known as Gene Notes) allows any user of GN to add notes to the GN database. You can add annotations for genes of interest. All annotation is public. For example, RWW has added annotations on expression patterns of genes in different brain regions using data taken from the Allen Brain Atlas and GENSAT. Our first GeneWiki implementation does not conform to all WIKI standards, and it may be more appropriate to consider GeneWiki as a simple system for adding notes on genes. We hope to load GeneWiki with many of the NCBI GeneRIFs. (Implementation in progress by Jintao Wang.)
2005-12-19: An AJAX implementation of the Search Page is now being tested on the beta site. There should be almost no noticeable difference if you are using a current version of common web browsers (Explorer, Firefox, Safari). Please contact us if you have any problems. (Implemented by Jintao Wang.)
2005-12-15: GeneWiki feature added to GeneNetwork. You can add short annotations to the GN database that relate to genes using an interface we have borrowed from the NCBI Gene Reference into Function (GeneRIF). To read all annotations provided by all users please click on the Annotations button (or GeneRIF button). All annotations are open and public. Annotations should ideally be of use to the research community. Here is an example of a recent annotation entered for the mouse Etv1 gene: "Amygdala and hippocampal CA1 and subiculum expression signature, highly specific neocortical layer 5 expression signature, cerebellar granule cell expression signature (data from Allen Brain Atlas, ABA)."
When adding a note, if possible please provide a PubMed ID number or a web address (URL). You can use the Annotations feature to find groups of genes that belong to interesting functional categories. We are currently using this feature to define sets of "expression signatures" for different parts of the mouse brain, for example genes and transcripts with highly selective expression in the dentate gyrus of the hippocampal formation. (Implemented by Rob Williams and Jintao Wang.)
2005-12-02: New Advanced Search function now allows users to search for either cis-acting or trans-acting QTLs across entire expression data sets. The general format is "TransLRS=(Low_LRS_limit, High_LRS_limit, Mb_buffer)". This syntax can be combined in the ALL field with other conditions, such as the chromosome location of the QTL and the expression level of the trait. For a better explanation please see the Advanced Search page. (Implemented by Jintao Wang.)
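The cis/trans distinction behind this search can be sketched in Python. This is a minimal illustration and not the GeneNetwork implementation: the `Trait` record, its field names, the inclusive LRS bounds, and the rule that a QTL counts as trans when it lies on another chromosome or more than `mb_buffer` megabases from the gene are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Trait:
    """Hypothetical record: a transcript, its gene location, and its peak QTL."""
    name: str
    gene_chr: str
    gene_mb: float
    qtl_chr: str
    qtl_mb: float
    lrs: float

def is_trans(trait, mb_buffer):
    """Assumed rule: trans if the QTL is on another chromosome
    or farther than mb_buffer megabases from the gene."""
    if trait.gene_chr != trait.qtl_chr:
        return True
    return abs(trait.gene_mb - trait.qtl_mb) > mb_buffer

def trans_lrs(traits, low, high, mb_buffer):
    """Mimic TransLRS=(low, high, mb_buffer): trans QTLs with low <= LRS <= high."""
    return [t for t in traits if low <= t.lrs <= high and is_trans(t, mb_buffer)]

# Made-up example data, not taken from any GeneNetwork data set.
traits = [
    Trait("probe_a", "1", 50.0, "1", 52.0, 25.0),  # cis: same chromosome, 2 Mb away
    Trait("probe_b", "1", 50.0, "7", 10.0, 30.0),  # trans: different chromosome
    Trait("probe_c", "2", 20.0, "2", 90.0, 12.0),  # trans, but LRS below the range
]
print([t.name for t in trans_lrs(traits, 15, 100, 10)])
```

A cis search would be the complement of `is_trans` under the same assumed rule.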
2005-11-21: Demonstration XML Schema for mouse data sets has been published for the use of the Biomedical Informatics Research Network (BIRN). For readability, please review the source code version of this page. This is an initial demonstration/proof-of-principle. (Implemented by Hongqiang Li.)
2005-11-15: Basic Statistics pages have been improved to handle larger data sets and to provide better graphic output. (Implemented by Jintao Wang and Rob Williams)
2005-11-14: Literature Correlations gene data set by Ramin Homayouni, Michael Berry and colleagues has been updated. The literature correlations are positive values between 0 and 1 that summarize the pair-wise similarity of genes (or transcripts) on the basis of the known literature using the methods described on the Semantic Gene Organizer site. (Implemented by Ramin Homayouni, Lai Wei, Kevin Heinrich, and Jintao Wang.)
2005-11-01: New Affymetrix M430v2 Eye Data Set for 63 strains of mice (C57BL/6J, DBA/2J, their reciprocal F1 hybrids, 47 BXD recombinant inbred strains, and 12 diverse inbred strains) has been entered on the beta site by the UTHSC Hamilton Eye Institute. Expression data for whole eye is available from Species = Mouse, Group = BXD, and Type = Eye. The Information (INFO) file that accompanies this M430 data set is still provisional. Use of these data in publications is currently limited to members of the HEIMED consortium pending addition of more data, publication, and formal release, but if you would like permission to make selected use of data please contact Robert W. Williams, UTHSC. (Implemented by Lu Lu, Yan Jiao, Yanhua Qu with support of the Hamilton Eye Institute.)
2005-10-24: New Affymetrix M430v2 Hippocampus Data Set for 96 strains of mice (65 BXD, 13 CXB, and 16 diverse inbred strains, B6D2F1 and D2B6F1) will be placed on the beta site by the Hippocampus Array Consortium at the end of October. Expression data for whole hippocampus will be available from Species = Mouse, Group = BXD, and Type = Hippocampus. The Information (INFO) file that accompanies this M430 data set is still provisional. Use of these data in publications is currently limited to members of the consortium pending data addition, publication, and formal release, but if you would like permission to make selected use of data please contact Robert W. Williams, UTHSC. (Implemented by Lu Lu, Shirlean Goodwin, Yanhua Qu, Rob Williams, and members of the Hippocampus Consortium.)
2005-10-10: New Affymetrix M430v2 Striatum Data Set for a B6D2F2 Intercross has been placed on the beta test site by Robert Hitzemann and colleagues. Expression data for the striatum of 30 males and 30 females are available from Species = Mouse and Group = BDF2-2005. The Information (INFO) file that accompanies the M430 data is still provisional. For use of these unpublished data please contact Robert Hitzemann, Department of Behavioral Neuroscience, Oregon Health & Science University. (Implemented by Yanhua Qu.)
2005-10-07: Advanced Search options have been improved. The main improvement involves combining Gene Ontology searches with other advanced search syntax. (Implemented by Hongqiang Li.)
2005-09-28: GeneNetwork Mouse SNP Browser has been upgraded with Perlegen/NIEHS data. The SNP Browser is a tool that is used in combination with the Interval Analyst to evaluate and rank genes and polymorphisms in intervals thought to be responsible for variation in traits. The SNP Browser includes all Celera Genomics mouse SNPs, all public mouse SNPs in dbSNP (as of August 2005), and all Perlegen-NIEHS SNPs (http://mouse.perlegen.com/mouse/download.html as of late Sept 26, 2005). We thank Paul Thomas and Richard Mural of Celera Genomics, Gary Churchill and Natalie Blade of the Jackson Laboratory, and the Perlegen/NIEHS sequencing consortium for help and access to data. (Implemented by Robert Crowell, Alex Williams, and Jintao Wang.)
An example: To search for SNPs type in this string and then modify position as desired:
2005-09-27: Gene Ontology searching is now possible. This search feature allows you to search for all genes/transcripts related to particular categories using the appropriate GO identifier. For example, to extract all transcripts associated with "synaptic vesicle exocytosis" enter the string "GO:0016079" in the ANY field. To browse GO terms and classes link to AmiGo. As of Sept 2005, the GO contains approximately 20,000 terms of which approximately 6300 GO terms can be associated with genes in one or more of the GeneNetwork databases. Approximately 700 high level GO terms will return well over 200 genes. Given the 500 transcript limit it is therefore useful to select lower level GO terms that will return 100 or fewer probe sets/transcripts/genes. (Implemented by Hongqiang Li.)
2005-09-20: The UCSC Gene Browser is now linked to GeneNetwork from the Gene Description and Page Index as a "Quick Link" for both mouse and rat genomes. (Implemented by Jintao Wang at UT and Fan Hsu at UCSC.)
2005-09-06: Phenotype Data Entry SOP. We are beginning to develop standard operating procedures (SOP) to allow colleagues to deposit new data sets into the GeneNetwork. Please review this initial Phenotype data entry SOP if you have traits that you would like added to either an existing or new mapping panel. (Partially implemented by Rob Williams.)
2005-08-26: OHSU/VA B6D2F2 Brain mRNA 430 (Aug05) MAS5, RMA and PDNN array data sets now are available. These data sets include M430 Set A and Set B arrays (Implemented by Yanhua Qu.)
2005-08-19: GenomeGraph has been implemented for several large array data sets and can now be used for testing purposes. GenomeGraph is a new module of The GeneNetwork that is designed for the analysis of entire array data sets. (Implemented by Jintao Wang.)
2005-08-17: Dynamic GeneNetwork Database Schema Description allows database experts to review the data structure and fields used by the GeneNetwork MySQL relational database. We have just begun the textual annotation of the database tables and fields. This new system will soon replace the current database "dump" available at http://www.genenetwork.org/schema.html (Implemented by Hongqiang Li.)
2005-08-16: Traits in the Selections Windows Now Sortable. The Selection command is used to move trait data from one or more databases into a single Selections window (aka the "shopping cart") for common analysis. For example, users can put classical phenotypes such as body and brain weight in the same Selections window with transcripts for growth hormone receptor (Ghr), GH releasing hormone (Ghrh), and GHRH receptor (Ghrhr) in liver and brain. The new feature makes it possible to sort items in the Selections window by database, position, or name. Sorting is helpful in reviewing contents of the window and in reordering items prior to calculating correlation matrices. Please recall that all items in a Selections window must come from a single genetic reference population or panel, for example the AXB/BXA strains of mice, the BXH strains of rat, or from one of several intercrosses. (Implemented by Jintao Wang.)
2005-08-12: New Mouse Liver and Metabolic Trait Databases have been released by Dr. Alan Attie and colleagues. While these data may be reviewed, their use is still reserved until final publication. The primary database is an Affymetrix M430 survey of gene expression in the liver of 60 selected F2 mice (a B6 x BTBR F2-ob/ob cross) that includes data on approximately 45,000 probe sets. This array database is accompanied by 24 classical metabolic and blood chemistry traits. All F2 animals were genotyped at 194 microsatellite markers. (Implemented by Alan Attie and colleagues, Yanhua Qu, and Jintao Wang.)
2005-08-08: The Interval Analyst (IA) provides a tabular summary of known genes in a chromosomal interval with data on gene expression, gene size, SNP number and density, and human homologs. The IA is still a beta site function but will be released to the public site in the next week. The IA table is automatically generated with each chromosome map. IA tables can be extensively customized and resorted. For the BXD and AXB/BXA mouse genetic reference panels, the IA also provides access to Celera SNPs, as well as public SNPs from a variety of sources. Clicking on the SNP number for a specific gene in the IA generates a SNP browser table (at present, only for mouse). The purpose of the IA is to allow users to rank-order genes in an interval that may be contributing to variability in phenotypes. (Implemented by Evan Williams, Robert Crowell, Alex Williams, and Rob Williams.)
2005-08-08: The design of Chromosome and Whole Genome QTL Maps has been significantly improved and updated. These new physical QTL maps merge LRS or LOD functions with gene and SNP tracks and can be zoomed to the level of single genes and SNPs. Maps can be exported in 2X versions that are near publication quality. Below most maps you will now find an Interval Analyst table that can be customized to help rank order candidate genes. Variants of these new maps have been introduced to handle all species and genetic reference populations. (Implemented by Robert Crowell, Alex Williams, Evan Williams, and Rob Williams; final integration by Jintao Wang.)
Legend: Sample of a new high resolution physical map. This map shows a locus that modulates the expression of the Cart transcript (cocaine and amphetamine regulated transcript) on distal Chr 10 in BXD mouse strains (brain tissue). The Control Block, top middle, permits users to customize the display and its resolution. Pink, blue, and beige horizontal bars above the map provide links to higher resolution maps (8x) or to the UCSC and ENSEMBL genome browsers. Statistical thresholds for linkage are marked by grey and pink horizontal lines and are based on 2,000 permutations. The Y-axis provides a scale for the plot of LRS or LOD scores that are plotted using a thicker blue line. The calculation of linkage statistics is based on a total of 147 useful markers that have been genotyped in all 89 BXD strains (The Wellcome-CTC Mouse Strain SNP database with added microsatellite markers). The far more digital look of the LRS function compared with traditional interval maps arises for the simple reason that locations of recombinations in this cross have been precisely defined and only a few regions exploit a true interval mapping approach (see News item of 2005-06-17 for additional detail).
The thinner red and green lines and the right Y-axis display the additive effect size; green for high alleles inherited from one parent (DBA/2J in this example), and red for high alleles from the other parent (C57BL/6J). The units are log2 expression differences where 0.2 is equivalent to a 2^0.2-fold difference. The large number of closely packed tick marks along the top of the map show locations of genes on Chr 10. Gene blocks are color coded by the average density of SNPs per gene using a rainbow color sequence with low density in the blue/green spectrum and high density in orange/red spectrum. The bright orange hash marks along the X-axis provide a graphic estimate of numbers of SNPs that are segregating in the BXD strains in any particular chromosomal region. A long interval from 30 Mb to 65 Mb is almost identical by descent between the two parental strains.
Many regions of these maps are responsive to a mouse click. For example, the name and size of any gene can be determined by simply placing the mouse cursor over its mark. The same applies to the significance thresholds and the SNP track. Below each of these maps is a complete list of known genes in the interval with numerous links to other data types, including information on expression, lists of known SNPs in each gene, and corresponding regions of the human genome. All physical map positions in mouse are based on the Mouse Build 34, mm6 (March 2005).
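The log2 units used for the additive effect in the legend above convert to fold differences by raising 2 to the stated value. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and chosen only for this example.

```python
def log2_diff_to_fold(diff):
    """Convert a difference in log2 expression units to a fold difference."""
    return 2.0 ** diff

# 0.2 log2 units is about a 1.15-fold difference; 1.0 unit is exactly 2-fold.
for diff in (0.2, 0.5, 1.0):
    print(f"{diff} log2 units = {log2_diff_to_fold(diff):.3f}-fold")
```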
2005-08-02: An Export Traits function button has been added to the set of tools available in each Selections Window (the Selections window is known informally as the "shopping cart"). Export Traits now joins other tools such as Cluster Tree, Network Graph, and Compare Correlates at both the top and bottom of each Selection window. Any set of traits in the Selection window can be easily exported, including conventional phenotypes, genotypes, and subsets of array data. The default output format is compatible with Microsoft Excel. (Implemented by Jintao Wang.)
2005-07-29: Rat RAE 230A and Mouse (M430 and U74A) Affymetrix Probe Set Annotation Tables have been significantly improved and realigned to rat and mouse genome assemblies. Information taken from the BLAT alignment data has been added to GeneNetwork data tables. Data types include the alignment score of concatenated probes, probe set specificity (usually the ratio of first hit score divided by second hit score), and position values of the 3' and 5' ends of the concatenated probe sequences. [Implemented by Senhua Yu (rat) and Yanhua Qu (mouse).]
2005-07-27: All Mouse Genotype Databases have now been fully updated using Wellcome-Illumina-CTC SNP data sets consisting of 13377 SNPs. These SNPs have been integrated with the older microsatellite markers used through July 2005. You can search for markers (see Advanced Search) and treat genotypes as a standard "trait." You can also align the sequence of any marker to the latest genome assembly to determine where a SNP or microsatellite is located. (Implemented by Jing Gu, Lu Lu, and Jintao Wang.)
2005-07-26: Complete Upgrade of the PUBLISHED PHENOTYPE Databases. All PubMed abstracts were searched in June and July of 2005 for publications pertaining to BXD, AXB, CXB, or BXH mouse recombinant inbred strains. Means and standard errors were collected, reviewed, and extracted from these papers. Data were then entered manually in GeneNetwork tables by Emily English and Elissa Chesler.
2005-07-26: Sorting Traits by several different variables is now possible in the Search Results page. Select from seven different ways to sort lists as shown in the screen shot below. (Implemented by Jintao Wang.)
2005-07-26: QTL Reaper tutorial has been added to the GeneNetwork site. QTL Reaper is a command line program for high throughput mapping of array data sets. (Implemented by Evan Williams.)
2005-07-25: An Error Detected and Corrected in SJUT Cerebellum databases dated March 2005. Data for BXD23 mistakenly included a BXD14 sample. All three March 2005 databases (RMA, PDNN, MAS5) have now been corrected. Values for the two affected strains are changed relative to data in this database prior to July 25, 2005. (Implemented by Jing Gu, Rob Williams, and Yanhua Qu.)
2005-07-22: Modified Linux Virtual Server configuration to eliminate problems with client institution firewall restrictions on numbers of simultaneous connections. Our thanks to Dr. Michael Miles for his help diagnosing firewall problems for clients. (Implemented by Jintao Wang.)
2005-07-21: Improved Advanced Search. It is now possible to combine search strings to generate complex queries. For example, this combination Mb=(Chr11 90 100) Mean=(12 20) when entered into the lower ALL field will find transcripts that map to Chr 11 between 90 and 100 Mb that also have mean expression between 12 and 20 units. (Implemented by Jintao Wang.)
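The combined query in this example behaves like a logical AND of a position condition and a mean-expression condition. The sketch below applies that idea to a few made-up records in Python. The record layout, the gene names, and the inclusive range bounds are assumptions for illustration; GeneNetwork itself runs such searches against its MySQL database.

```python
# Hypothetical in-memory records standing in for database rows.
transcripts = [
    {"symbol": "gene_a", "chr": "11", "mb": 95.2, "mean": 13.1},
    {"symbol": "gene_b", "chr": "11", "mb": 95.9, "mean": 8.4},   # mean too low
    {"symbol": "gene_c", "chr": "11", "mb": 40.0, "mean": 14.0},  # outside 90-100 Mb
    {"symbol": "gene_d", "chr": "5", "mb": 95.0, "mean": 15.0},   # wrong chromosome
]

def matches(rec, chrom, mb_lo, mb_hi, mean_lo, mean_hi):
    """True when a record satisfies both Mb=(Chr lo hi) and Mean=(lo hi)."""
    in_position = rec["chr"] == chrom and mb_lo <= rec["mb"] <= mb_hi
    in_mean = mean_lo <= rec["mean"] <= mean_hi
    return in_position and in_mean

# Equivalent of: Mb=(Chr11 90 100) Mean=(12 20)
hits = [r["symbol"] for r in transcripts if matches(r, "11", 90, 100, 12, 20)]
print(hits)
```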
2005-07-15: GeneNetwork Mouse SNP Browser has been implemented. The SNP Browser is a tool that will eventually be used in combination with the Interval Analyst to evaluate and rank genes and polymorphisms in intervals thought to be responsible for variation in traits. The SNP Browser includes all Celera Genomics mouse SNPs, all public mouse SNPs in dbSNP (as of August 2005), and all Perlegen-NIEHS SNPs (http://mouse.perlegen.com/mouse/download.html as of late June 2005). The SNP Browser is still at an early stage of development. We thank Paul Thomas and Richard Mural of Celera Genomics, Gary Churchill and Natalie Blade of the Jackson Laboratory, and the Perlegen/NIEHS sequencing consortium for help and access to data. (Implemented by Robert Crowell, Alex Williams, and Jintao Wang.)
An example: To search for SNPs on Chr 5 from X to Y Mb:
2005-07-15: Access to GeneNetwork Archive site. The archive site provides access to old data sets and old genotype files that have now been superseded. We anticipate that it will be used mostly to verify old findings and to document changes in results. The Archive is now available from the main search page. (Implemented by Jintao Wang.)
2005-07-13: LXS Genotypes Upgraded. Genotypes for the large GRP of LXS strains have been greatly improved thanks to the Illumina-Wellcome-CTC SNP project. The original set of 330 markers has been replaced with a set of 2659 informative markers. Download either the LXS genotypes or BXD genotypes used by WebQTL as text files. (Implemented by Jing Gu and Jintao Wang.)
2005-07-12: Search Page Upgraded. Users now can change the default settings to those they most commonly use. Your browser must be configured to allow The GeneNetwork to retain a "cookie" on your computer. We have also added a new button labeled ADVANCED SEARCH that provides advice and syntax for searches. (Implemented by Jintao Wang.)
2005-07-12: Pair-Scan Upgraded. The pair scan now exploits the new Wellcome-Illumina high density genotype files. This results in more exhaustive searches for two-locus interactions. This is particularly true when single chromosome pairs are scanned by clicking on the initial DIRECT output graph. (Implemented by Jintao Wang.)
2005-07-12: Updated Affymetrix M430 GeneChip Annotation Data. We have realigned all M430 probes and probe set sequences onto the latest mouse assembly (Build 34 or mm6). This annotation is more complete than most other available M430 probe set annotation of which we are aware, including Affymetrix NetFX. (Implemented by Yanhua Qu.)
2005-06-17: New High Density Mapping Algorithm that exploits the CTC-Wellcome SNP data has been implemented for the BXD mouse genetic reference population on both public and beta sites. In the case of the BXD panel (BXD1 through BXD100), the merged SNP and microsatellite maps are based on a total of 7636 informative markers that differ between the parental strains, C57BL/6J (B) and DBA/2J (D). The locations of these markers are known on the latest assembly of the mouse genome (Build 34, mm6). The median distance between markers in this subset is 178,831 bp. The mean distance is 324,493 bp. There are only 26 intervals between markers that are longer than 5 Mb. No interval is greater than 10 Mb except on Chr X. These long intervals are essentially monomorphic between the parental strains.
The new algorithm exploits a selected subset of 3795 markers that includes all markers with unique strain distribution patterns (SDP) as well as pairs of markers (the most proximal and most distal markers) for SDPs represented by two or more markers. This BXD genotype data set can be downloaded by ftp at ftp://atlas.utmem.edu/public/BXD_WebQTL_Genotypes_June05.txt.
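The marker selection described above can be sketched in Python. This is an illustration under stated assumptions, not the code used to build the genotype file: it assumes that markers are processed in chromosomal order within one chromosome, that an SDP is a string of B/D calls across strains, and that "represented by two or more markers" refers to runs of adjacent markers sharing the same SDP. The function name and the example data are hypothetical.

```python
from itertools import groupby

def select_markers(markers):
    """markers: list of (name, sdp) tuples in chromosomal order for one chromosome.

    Keeps every marker with an SDP that occurs once in a run, and only the
    most proximal and most distal marker of each run of identical SDPs.
    """
    kept = []
    for _sdp, run in groupby(markers, key=lambda m: m[1]):
        run = list(run)
        kept.append(run[0])        # most proximal marker of the run
        if len(run) > 1:
            kept.append(run[-1])   # most distal marker of the run
    return kept

# Made-up markers for four strains; each SDP string holds one call per strain.
chr_markers = [
    ("m1", "BBDD"),
    ("m2", "BBDD"),
    ("m3", "BBDD"),
    ("m4", "BDDD"),
    ("m5", "BDDB"),
    ("m6", "BDDB"),
]
print([name for name, _ in select_markers(chr_markers)])
```

Here `m2` is dropped because it sits inside a run whose end points already carry the same strain distribution pattern.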
The mapping algorithm is a mixture of simple marker regression, linear interpolation, and standard Haley-Knott interval mapping. If two adjacent markers have identical SDPs, they will have identical linkage statistics, as will the entire interval between the markers (assuming complete and error-free haplotype data for all strains). On a physical map the LRS and the additive effect values will therefore be constant over this interval. Between neighboring markers that are separated by 1 cM or more we use a conventional interval mapping method (Haley-Knott) combined with a Haldane estimate of genetic distance. When the interval is less than 1 cM we simply interpolate linearly based on a physical scale between the markers. The result of this mixture mapping algorithm is a map of the trait that has an unusual profile that is particularly striking on a physical (Mb) scale, with many plateaus, abrupt linear transitions between plateaus, and a few regions with the standard graceful curves typical of interval maps.
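The three cases in this mixture can be sketched in Python. The sketch shows only the choice between methods, the linear interpolation on a physical scale, and the standard Haldane map function; it does not implement Haley-Knott regression itself. All function names are hypothetical, and the example values are made up.

```python
import math

def haldane_recombination_fraction(cm):
    """Haldane map function: recombination fraction for a distance in centimorgans."""
    return 0.5 * (1.0 - math.exp(-2.0 * cm / 100.0))

def choose_method(sdp_left, sdp_right, gap_cm):
    """Pick how to fill the interval between two adjacent markers."""
    if sdp_left == sdp_right:
        return "plateau"        # identical SDPs: constant LRS across the interval
    if gap_cm < 1.0:
        return "linear"         # short interval: interpolate on the Mb scale
    return "haley-knott"        # 1 cM or more: conventional interval mapping

def interpolate_lrs(lrs_left, lrs_right, mb_left, mb_right, mb):
    """Linear interpolation of LRS at physical position mb between two markers."""
    frac = (mb - mb_left) / (mb_right - mb_left)
    return lrs_left + frac * (lrs_right - lrs_left)

print(choose_method("BBDD", "BBDD", 0.4))
print(choose_method("BBDD", "BDDD", 0.4))
print(choose_method("BBDD", "BDDD", 2.5))
print(interpolate_lrs(10.0, 20.0, 50.0, 52.0, 50.5))
print(haldane_recombination_fraction(1.0))
```

The plateau and linear cases account for the flat segments and abrupt straight transitions in the maps, while only the third case produces the smooth curves of a conventional interval map.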
The same procedure will soon be implemented for other mouse GRPs, including AXB/BXA, CXB, BXH, and AKXD.
For users that would like reference access to the old set of genotypes, we will set up an Archive site with the May 2005 microsatellite markers and maps.
To download the combined SNP and microsatellite genotype file used in WebQTL please link to ftp://atlas.utmem.edu/public/ and look for Illumina_UT_BXD_May05.xls (entire data set) or BXD_WebQTL_Genotypes_June05.txt (extracted subset of markers used by WebQTL), or link to Dr. Richard Mott's Mouse Inbred Line Genotype site for the original SNP data set. (Implemented by RW Williams, KF Manly, and JT Wang.)
2005-06-13: Rat HXB Fat Data Set released on the www.genenetwork.org/search3.html test site (stabilized RMA transform). The Affymetrix RAE230A data files generated by Tim Aitman and colleagues were downloaded from the Array Express site. The set of 120+ arrays covers a total of 30 RI strains and complements a recent paper (Hübner et al., 2005). Error checking is still in progress and this is a pre-release data set to use for test purposes. (Implemented by Senhua Yu, R. Williams, and Jintao Wang. More transforms are in progress.)
2005-06-12: Moved GeneNetwork and Upgraded Utilities. The GeneNetwork and the WebQTL module have been moved to a cluster of nine P4 single processor computers. Eight of the nodes are devoted to the GeneNetwork application code while the ninth node runs the Linux virtual server. The MySQL database server currently runs on a separate Proliant dual processor node. The Roundup issue tracking system has been upgraded to v. 0.83 and is now available at http://www.genenetwork.org:8080/webqtl/. Analog has also been upgraded to v 6.0. (Implemented by Jintao Wang, with thanks again to Ari Berman.)
2005-05-24: Ultra-high Resolution Mouse SNP Genetic Maps are now gradually replacing the previous generation of microsatellite maps. Until May 2005, all genetic maps of recombinant inbred strains of mice in WebQTL have relied heavily on a set of roughly 1500 microsatellite markers genotyped across all RI sets by the Informatics Center for Mouse Neurogenetics (Williams et al., 2001; Peirce, Lu et al, 2004). In collaboration with members of the CTC (Richard Mott, Jonathan Flint and colleagues), we have helped genotype a total of 480 strains using a panel of 13,377 SNPs. More than half of the SNPs are informative in most crosses. These SNPs have been combined with microsatellites to produce new consensus maps for BXD and other GRPs using the latest mouse genome assembly as a reference frame (Build 34 - mm6). In the case of the BXD GRP, a total of 88 strains were genotyped using the full set of SNPs of which 7482 are informative. The order of markers given in WebQTL is essentially the same as that given in Build 34. To reduce false positive errors when mapping using this ultradense map, we have eliminated most single genotypes that generate double-recombinant haplotypes. Double-recombinant haplotypes are most commonly produced by typing errors ("smoothed" genotypes). (Implemented by Lu Lu, Jing Gu, Jintao Wang, Ken Manly, and Rob Williams, with help from Jonathan Flint and Richard Mott).
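The removal of single genotypes that create double-recombinant haplotypes can be sketched in Python. This is an assumption-laden illustration rather than the procedure actually used: it treats one strain's genotypes as a string of calls in marker order, flags a call only when both flanking calls agree with each other and disagree with it, and replaces the flagged call with an unknown code `U`. The function name and the choice of `U` are hypothetical.

```python
def smooth_double_recombinants(genotypes, missing="U"):
    """genotypes: one strain's calls in marker order along a chromosome.

    A single call flanked on both sides by the opposite genotype implies two
    recombinations in a very short span, which is more likely a typing error.
    Such calls are replaced by the missing code; everything else is kept.
    """
    smoothed = list(genotypes)
    for i in range(1, len(genotypes) - 1):
        left, mid, right = genotypes[i - 1], genotypes[i], genotypes[i + 1]
        if missing in (left, mid, right):
            continue  # cannot judge a call next to unknown data
        if left == right and mid != left:
            smoothed[i] = missing
    return smoothed

# The isolated D at the third marker is flagged; the real B-to-D transition is kept.
print("".join(smooth_double_recombinants("BBDBBDDDD")))
```

A stricter variant could also take the physical distance between the flanking markers into account, since a double recombinant over a long interval is more plausible than one over a few kilobases.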
2005-05-23: Search Functions have been upgraded. It is now possible to find (1) all transcripts whose genes map to a given chromosomal location; (2) all traits and transcripts that have a mean value within a particular range; (3) all traits that have a peak genome-wide linkage score (LRS score or p value) within a particular range. These new search functions are still being tested on the test site (http://www.genenetwork.org/search3.html). (Implemented by Jintao Wang).
(1) To find transcripts by chromosomal position the search syntax needs to follow these rules:
- "Position in (ChrY 0.3 52.4)" or "Position = (Chr1, 98 104)" [Note: No space between "Chr" and the number or letter of the chromosome. ]
- "Pos in (ChrY 0.3 52.4)" or "Pos =(Chr1, 98 104)" [don't enter the quotes.]
- "Mb in (ChrY 0.3 52.4)" or "Mb = (Chr1, 98 104)" [don't enter the quotes.]
(2) To find traits by mean value, the search syntax needs to follow these rules:
- "Mean in (12.3, 12.4)" or "Mean=(12.3, 12.4)" [These strings will find those traits with a mean value from 12.3 to 12.4. Don't enter the quotes.]
(3) To find traits by LRS value or p value, the search syntax needs to follow these rules:
- "LRS in (20, 30)" or "LRS=(20, 30)" [These strings will find traits with LRS values ranging from 20 to 30. This search depends on the existence of a database of precomputed LRS values. If this database has not yet been set up for a particular data set, then the search will not return any records. Don't enter the quotes.]
- "pvalue in (0.0001, 0.001)" or "pvalue=(0.0001, 0.001)" [These strings will find traits with p values ranging from 0.0001 to 0.001. This search depends on a database of precomputed values. If this database has not yet been set up for a particular data set, then the search will not return any records. Don't enter the quotes.]
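The search syntax rules above can be read as a small grammar: a keyword, then either "in" or "=", then a parenthesized list separated by commas or spaces. The Python sketch below parses strings of that shape. It is an illustration only and not the parser GeneNetwork uses; the function name, the output tuples, and the case-insensitive keyword matching are assumptions.

```python
import re

# Keyword, then "=" or "in", then a parenthesized argument list.
CONDITION = re.compile(
    r"\b(Position|Pos|Mb|Mean|LRS|pvalue)\s*(?:=|in)\s*\(([^)]*)\)",
    re.IGNORECASE,
)

def parse_conditions(query):
    """Return a list of tuples describing each condition found in the query."""
    conditions = []
    for key, body in CONDITION.findall(query):
        args = [a for a in re.split(r"[,\s]+", body.strip()) if a]
        key = key.lower()
        if key in ("position", "pos", "mb"):
            chrom = args[0][3:]  # strip the "Chr" prefix, e.g. "ChrY" -> "Y"
            conditions.append(("position", chrom, float(args[1]), float(args[2])))
        else:
            conditions.append((key, float(args[0]), float(args[1])))
    return conditions

print(parse_conditions("Mb = (Chr1, 98 104)"))
print(parse_conditions("Position in (ChrY 0.3 52.4) Mean=(12.3, 12.4)"))
print(parse_conditions("pvalue in (0.0001, 0.001)"))
```

Treating "Position", "Pos", and "Mb" as synonyms mirrors the three equivalent forms listed under rule (1).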
2005-05-13: Virtual Server implementation of The GeneNetwork is being beta tested. The Linux Virtual Server (LVS) allows GeneNetwork to exploit a small clusters of servers to handle larger numbers of clients quicky. Performance is particularly critical during bioinformatics class projects when large numbers of students make nearly simultaneous requests. (Implemented by Jintao Wang, Senhua Yu, and Ari Berman).
2005-05-12: Genome Explorations Inc. has been provided a license to run a copy of the GeneNetwork and WebQTL software as part of a Phase I Small Business Innovation Research (SBIR) grant from NIAAA. The TCP/IP address is 18.104.22.168. The site currently contains three data sets (MAS5, RMA, and PDNN) generated at GE and UTHSC (subcontractor) using a total of 85 Affymetrix M430 2.0 arrays. The first data release consists of 26 BXD strains, the two parental strains, C57BL/6J and DBA/2J, and ten other inbred strains of mice (A/J, 129S1/SvJ, AKR/J, BALB/cJ, BALB/cByJ, C3H/HeJ, CAST/Ei, KK/HIJ, LG/J, and NOD/J). (Implemented by Jintao Wang, Yanhua Qu, Lu Lu, Robert Williams, Robert Rooney, and Divyen Patel).
2005-05-10: Whole Transcriptome Mapping Display: We are testing an interface that displays an entire transcriptome QTL map for a tissue similar to figures 3A and 3B of Chesler and colleagues (2005). Note that one parameter can be used to modify the false discovery rate of the points that are plotted. Plots have been precomputed for more than 30 databases and transforms. (Implemented by Jintao Wang).
2005-05-04: New Mouse Genome Assembly (NCBI Build 34, UCSC mm6) released by NCBI (implemented by Deanna Church and colleagues). Over the next several months all mouse genome megabase and nucleotide position data and links in the GeneNetwork (markers, probes, SNPs, genes) will be converted to this new assembly. BLAT searches initiated with WebQTL already exploit the most recent build. GeneNetwork users may find small discrepancies in gene and marker locations until all database tables are updated.
2005-04-22: Arabidopsis Data Sets released on the www.genenetwork.org/search3.html test site. The Genotypes and Phenotypes files for the Bay-0 x Shahdara cross data were all provided by Olivier Loudet. Please see the Information file. Implemented by O. Loudet, R. Williams, and Jintao Wang.
2005-04-21: Rat HXB Kidney Data Set released on the www.genenetwork.org/search3.html test site (original RMA transforms). The Affymetrix RAE230A data files were provided by Norbert Hübner and colleagues. The set of 120+ arrays covers a total of 30 RI strains and complements a recent paper (Hübner et al., 2005). Implemented by Senhua Yu, R. Williams, and Jintao Wang. More transforms are in progress (MAS5 added May 13, 2005).
2005-04-14: New S-Score Transform for the BXD Brain data set released on the www.genenetwork.org/search3.html test site. This data set complements existing MAS5, PDNN, RMA, dChip, and HWT1PM transforms. The Significance score method centers the expression of every probe set at 0. The signal values are therefore the strain deviations in Z score units from the grand mean based on 100 arrays. The S-score software is described in Zhang et al. (2002) and Kerns et al. (2003).
2005-04-08: Expanded HBP/Rosen Striatum Data Sets released on the www.genenetwork.org/search3.html test site (MAS5, RMA, and PDNN transforms). The new data set covers a total of 33 strains using 59 M430 2.0 arrays. A good demonstration of the improved performance of the expanded data set is Kcnj9 (probe set 1450712_at_A), a known cis-QTL in multiple data sets. This trait generates a peak LRS score of 27.0 in the initial November 2004 data set (MAS5) and a peak LRS of 47.8 in the April 2005 data set (MAS5). The peak LRS is approximately 600 Kb proximal to the Kcnj9 gene. The Heritability Weighted Transform (HWT) data set will be added in the next several weeks.
2005-04-04: Expanded INIA Brain Data Sets released on the www.genenetwork.org/search3.html test site (MAS5, RMA, and PDNN transforms). Seventy-one new samples have been added, bringing the total to 105 arrays covering 42 BXD strains, both parents, and the F1 hybrid. A good demonstration of the improved performance of the expanded data set is Kcnj9 (probe set 1450712_at_A), a known cis-QTL in multiple data sets. This trait generates a peak LRS score of 14 in the initial October 2004 data set (MAS5) and a peak LRS of 41.9 in the April 2005 data set (MAS5). The peak LRS is approximately 2000 Kb distal to the Kcnj9 gene. We have also tested these data using probe set 1418908_at_A (Pam). This trait generates a peak LRS score of 52.8 in the initial October 2004 data (MAS5) and a peak LRS of 54.2 in the April 2005 data (MAS5). The peak LRS is approximately 800 Kb distal to the Pam gene. The Heritability Weighted Transform (HWT) data set will be added in the next several weeks.
2005-03-21: Expanded Cerebellum Data Sets released on the www.genenetwork.org/search3.html test site (MAS5, RMA, and PDNN transforms). Fifty-four new samples have been added. We have tested these data using probe set 1418908_at_A (Pam), a known cis-QTL in multiple data sets. This trait generates a peak LRS score of 31.7 in the initial March 2003 data (MAS5), a peak LRS score of 32.3 in the October 2004 data (MAS5), and a peak LRS of 52.2 in the March 2005 data (MAS5). In the March 2005 data, the peak LRS is only 500 Kb from the 5' promoter region of the Pam gene. The abundantly expressed GABA alpha 6 receptor (Gabra6) transcript (1417121_at_A) is another good test case of a cis-modulated trait in cerebellum. (Implemented by the GeneNetwork group and the Cerebellum Consortium). The Heritability Weighted Transform (HWT) data set will be added in the next several weeks.
2005-03-15: Cluster Trees now compute and display up to 100 traits simultaneously. This makes it possible to select the top 100 covariates of a trait from a Correlation Results table and map all 100 as a hierarchically organized group. (Implementation by Jintao Wang).
2005-03-04: Literature Correlation data set has been integrated into GeneNetwork Correlation Results output tables. This important new feature provides an estimate of the strength of relations between pairs of genes that is based on a textual analysis of PubMed abstracts (latent semantic index correlations). Values are based on a matrix of 16,000 gene-gene similarity scores computed by Ramin Homayouni (UTHSC) and Michael Berry (UT Knoxville). This feature is still experimental, and GeneNetwork users should note that pairs of genes that are mentioned together in a small set of papers may have inappropriately high correlations. For more information on the algorithm please contact Ramin Homayouni. (Implementation by Ramin Homayouni and Jintao Wang).
2005-03-01: Network Graph output has been improved significantly. It is now possible to change the labels from probe set IDs to gene symbols. Nodes can also be color-coded by database. Markers and genotypes can be used as nodes. Literature Correlations can be used to define the lines (edges) between traits. (Implementation by Jintao Wang).
2005-03-01: Heritability Weighted Transform method has been published in Genome Biology. This method (HWT1PM) provides significantly higher signal than other common transforms. (Design and implementation by Ken Manly).
2005-02-23: Database Schema has been published online at http://www.genenetwork.org/schema.html. This schema (January 2005 version) was generated using MySQLdump v 9.1. (Implementation by Jintao Wang, Bill Bug, and Ken Manly).
2005-02-23: Scriptable Interface improved to handle queries from Genome Browser and other systems. The new interface provides a list of links to data from multiple tissues and strains for a single gene. For example, to retrieve expression estimates for Kcnj8 the URL query has this form: http://www.genenetwork.org/cgi-bin/beta/WebQTL.py?cmd=search&gene=kcnj8. This query does not resolve the many possible aliases for gene symbols, and requires the use of the preferred or official gene symbol. (RWW, implementation by Jintao Wang)
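A query URL of the form shown above can be assembled programmatically. The short Python sketch below uses only the standard library; the helper name gene_search_url is an assumption, and only the cmd and gene parameters that appear in the example URL are used.

```python
from urllib.parse import urlencode

# Base address taken from the example URL in the news item above.
BASE_URL = "http://www.genenetwork.org/cgi-bin/beta/WebQTL.py"


def gene_search_url(symbol, base_url=BASE_URL):
    """Build a search URL for one official gene symbol (aliases are not resolved)."""
    query = urlencode({"cmd": "search", "gene": symbol.strip()})
    return f"{base_url}?{query}"


print(gene_search_url("kcnj8"))
```

Because the interface does not resolve aliases, a calling script would need to map any alias to the preferred or official gene symbol before building the URL.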
2005-01-27: QTL Reaper 1.0.0 has been released. QTL Reaper is a platform-independent program for rapidly mapping thousands of traits. It is now available to advanced users at SourceForge (241 KB, written in Python and C with sample and help files). QTL Reaper can map well over 50,000 traits in under 12 hours on fast single-processor systems. It includes a sophisticated method (Besag and Clifford, 1991) to adjust the number of permutation tests to estimate genome-wide p values with reasonable precision down to values of approximately 10^-5 (10^6 permutations). This feature is useful for identifying reproducible QTLs in large transcriptome data sets, that is, sets of QTLs with defined false discovery rates. (Design by Ken Manly, implementation by Jintao Wang).
Besag J, Clifford P (1991). Sequential Monte Carlo p-values. Biometrika 78: 301-304.
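The idea of adjusting the number of permutations can be illustrated with a minimal Python sketch of the sequential stopping rule described by Besag and Clifford (1991). This is not QTL Reaper's code. The function name, the parameters h and max_draws, and the draw_null callback (which stands in for "permute the data and recompute the peak statistic") are assumptions for illustration.

```python
import random


def sequential_p_value(observed, draw_null, h=10, max_draws=999):
    """Sequential Monte Carlo p value.

    Null statistics are drawn one at a time. Sampling stops as soon as h of
    them are at least as large as the observed statistic, so clearly
    non-significant traits use few draws. Otherwise all max_draws are used.
    """
    exceed = 0
    for n_drawn in range(1, max_draws + 1):
        if draw_null() >= observed:
            exceed += 1
            if exceed == h:
                # Early stop: h exceedances among n_drawn null draws.
                return h / n_drawn
    # Ran to the cap: the usual (g + 1) / n estimate with n = max_draws + 1.
    return (exceed + 1) / (max_draws + 1)


# Usage with a hypothetical null distribution (standard normal draws).
rng = random.Random(0)
print(sequential_p_value(0.2, lambda: rng.gauss(0.0, 1.0)))
print(sequential_p_value(4.5, lambda: rng.gauss(0.0, 1.0)))
```

With this rule the smallest reportable p value is 1 / (max_draws + 1), so resolving small p values requires a correspondingly large cap on the number of permutations.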
2005-01-26: The Pair-scan output tables now include a new analytic tool that provides a breakdown of strains in each genotype category (for example, the four two-locus genotypes: B/B, B/D, D/B, and D/D) either in the form of scatter plots or in the form of a box plot. This new feature is still being tested and refined and is currently available only on the test site (www.genenetwork.org/search3.html). This feature will be moved to the public site in February. (Implementation by Jintao Wang).
2005-01-22: Marker Genotype Databases have been added that complement trait and transcriptome databases for the following groups: AKXD, AXB/BXA, CXB, BXH, BXD, LXS, B6D2F2, and the rat HXB/BXH. These new databases enable you to use any marker genotype as a "trait" to search for transcripts or classical phenotypes that may be influenced by particular genomic regions. This is now possible using the new Genotype databases and the Compare Correlates tool. To find all markers on Chromosome 1, just type "Chr 1" or "Chromosome 1" into the Search field. These marker genotype databases are currently available on the test site (www.genenetwork.org/search3.html) but will be moved to the public site by late January. (Implementation by Jing Gu, Lu Lu, Yanhua Qu, Rob Williams, and Jintao Wang).
2005-01-21: New Data Download feature has been added. The Information files for most UTHSC Brain databases (e.g., the RMA Orig transform) now have links to Excel workbooks that include the full Affymetrix U74Av2 data set of 100 arrays for each transform. These Excel workbooks also include a separate spreadsheet with the strain averages for each transform. Look for the word "Download" in the Information pages. (Implementation by Yanhua Qu).
2005-01-13: We have added a new BLAST probe analysis tool to the Probe Information tables associated with each Affymetrix probe set. This button-tool aligns any PM 25-mer probe to the GenBank sequence that Affymetrix lists as being the sequence source. When BLAT analysis of concatenated probes does not provide an unequivocal map location for a probe set, this method can be used to verify that the GenBank accession is correct. If so, it may then be appropriate to BLAT the entire GenBank entry to verify probe set map location. (Implementation by Yanhua Qu).
2005-01-11: Rat HXB/BXH Published Phenotype databases added to the GeneNetwork. The genetic maps that are used in combination with these phenotypes are based on a total of 770 markers. Phenotypes were all provided by Michal Pravenec. We thank Tim Aitman and Pierre Mormede for review of their data sets. (Implementation by RWW, MP, and JW)
2005-01-03: We now provide links to entire data files for the U74Av2 brain data set. All DAT, CEL, TXT, RPT, and EXP files can be downloaded. For example, here are data files for five C57BL/6J U74Av2 arrays. The complete U74Av2 data set consists of a total of 100 arrays, all of which can be reached from the Main Table in any of the Information Pages for these different transforms (MAS5, RMA, PDNN, HWT1PM, dChip). The DAT, CEL, RPT and EXP files will be identical among all transforms. The only differences among transforms are the TXT files. The appropriate reference to cite if you make use of these data files is:
Chesler EJ, Lu L, Shou S, Qu Y, Gu J, Wang J, Hsu HC, Mountz JD, Baldwin N, Langston MA, Threadgill DW, Manly KF, Williams RW (2005) Genetic dissection of gene expression reveals polygenic and pleiotropic networks modulating brain structure and function. Nature Genetics 37: 233-42.
2004-12-25: First draft of the WebQTL Glossary is completed. Many key terms are now defined. We will be adding links to the glossary from graphs and other pages.
2004-12-22: An annotated Links list has been added. Email RW Williams if you have suggestions for additional sites that have proved useful in combination with GeneNetwork resources.
2004-12-21: We have implemented a new method of transforming Affymetrix microarray data called the Heritability Weighted Transform (Manly et al. 2005). When used with large Affymetrix data sets of the type used by WebQTL, this method is considerably more powerful than other common probes-to-probeset transforms such as MAS5, PDNN, RMA, or dChip. To evaluate this new method please try the Mouse/BXD/Brain/Database called UTHSC Brain mRNA U74Av2 (Dec03) HWT1PM (HWT1PM is short for Heritability Weighted Transform Version 1, Perfect Match Probes only). For further details on this method see the Info page. The reference for this approach to transforming Affymetrix array data is:
Manly KF, Wang J, Williams RW (2005) Weighting by heritability for detection of quantitative trait loci with microarray estimates of gene expression. Genome Biology 6: R27.
2004-12-17: We have added mouse UniGene identifiers from Build 142. It is therefore now possible to enter search terms such as "Mm.1" to find data on S100 calcium binding protein A10 (S100a10). A total of 38,034 probe sets on the Affymetrix mouse expression array 430 2.0 have UniGene identifiers.
2004-12-14: First draft of the WebQTL Frequently Asked Questions is completed. We will be happy to answer any other questions you have. Please email RW Williams.
2004-12-13: Major additions are expected later in December in both the SJUT Cerebellum data set and in the INIA Brain data set. Sample size will be almost doubled in both data sets.
2004-12-10: Updated positions of Mouse Expression Agilent G4121A probes using the May 2004 (mm5) assembly of the mouse genome. This work was carried out by Yanhua Qu.
2004-12-10: We have begun to combine WebQTL and The GeneNetwork. WebQTL is the first and so far only "channel" of the GeneNetwork. However, our hope is that there will soon be other projects that will share use of the GeneNetwork. The main URL is now www.genenetwork.org. Requests to www.webqtl.org will resolve to www.genenetwork.org.
2004-12-03: Rat HXB/BXH genotype and published phenotype databases added to beta test site of WebQTL. The genetic maps are based on a total of 770 markers. Phenotypes were all provided by Dr. Michal Pravenec.
2004-12-02: Important new graphic and analytic tools have been added.
The first of these is the Compare Correlates tool. This function is available in Selection Windows. It is essentially a Venn diagram set tool. Instead of providing simple graphs, it provides lists of traits in different parts of a virtual Venn diagram. For example, to find traits that covary with Sonic Hedgehog, Indian Hedgehog, Desert Hedgehog, Patched1, and Gli3, you would select five key transcripts into a Selections window (use the "Add Selection" tool and then select a group of traits in the Selections window). Compare Correlates allows you to choose the target database to which the key traits will be correlated. Compare Correlates was designed by Elissa Chesler and Stephen Pitts. Code was written and optimized by Stephen Pitts.
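The "virtual Venn diagram" idea can be sketched with ordinary set operations. The Python fragment below is an illustration only, not the Compare Correlates code; the function name venn_regions and the trait labels are hypothetical. Each key trait is assumed to come with the set of traits that correlate with it in the target database.

```python
def venn_regions(correlate_sets):
    """Group traits by the exact combination of key traits they correlate with.

    correlate_sets maps each key trait to the set of its correlates. The
    result maps a frozenset of key traits (one Venn region) to the traits
    that fall in that region and in no larger one.
    """
    regions = {}
    all_traits = set().union(*correlate_sets.values())
    for trait in all_traits:
        members = frozenset(
            key for key, correlates in correlate_sets.items() if trait in correlates
        )
        regions.setdefault(members, set()).add(trait)
    return regions


# Hypothetical correlate lists for three key traits.
example = {
    "key_A": {"t1", "t2", "t3"},
    "key_B": {"t2", "t3", "t4"},
    "key_C": {"t3", "t5"},
}
for keys, traits in venn_regions(example).items():
    print(sorted(keys), sorted(traits))
```

The region keyed by all of the key traits holds the traits that covary with every one of them, which is the list of most interest in the Hedgehog example above.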
The second new tool is Network Graph. This function displays a set of traits and their correlations in the form of a graph with nodes (traits) and lines (correlations). There are quite a few tunable parameters, including the correlation threshold used to draw (or not draw) a line between nodes. To use this new tool, you again need to have traits loaded into one of the Selections windows. Network Graph was designed by Elissa Chesler and Stephen Pitts. Code was written, optimized, and error-checked by Stephen Pitts.
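The role of the correlation threshold can be shown with a small Python sketch. It is not the Network Graph code; the function name network_edges, the default threshold, and the use of the absolute value of the correlation are assumptions, and the trait labels and values are made up for illustration.

```python
def network_edges(correlations, threshold=0.5):
    """Return the (trait_a, trait_b, r) edges whose |r| meets the threshold.

    correlations maps a pair of trait names to their correlation. Pairs
    below the threshold get no line, so raising it thins the graph.
    """
    edges = []
    for (trait_a, trait_b), r in sorted(correlations.items()):
        if trait_a != trait_b and abs(r) >= threshold:
            edges.append((trait_a, trait_b, r))
    return edges


# Hypothetical correlations among three selected traits.
example = {("x", "y"): 0.9, ("x", "z"): -0.7, ("y", "z"): 0.1}
print(network_edges(example, threshold=0.5))
```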
2004-10-24: Updated positions of all Mouse Expression U74Av2, 430A, 430B, and 430 2.0 probe sets using the May 2004 (mm5) assembly of the mouse genome. This work was carried out by Yanhua Qu. The M430 data consist of 45,000 probe sets. Positions were obtained using a series of methods: Method 1. A BLAT analysis of the actual probe sequence using a 48-processor cluster (our thanks to Yan Cui). Roughly 90% of all probe sets were mapped using this method. If the probe sequence did not BLAT with a score above 99 AND an identity match of 100, then we used Method 2: We used the position of the probe set given in the affMOE430.txt.gz data file. This method recovered position data for approximately 5% of all probe sets. If Method 2 failed, then we used Method 3: We obtained the position given by Affymetrix in the files called "MOE430A Annotations, CSV (6.3 Mb, 10/12/04)" and "MOE430B Annotations, CSV (3.9 Mb, 10/12/04)". This method recovered positions for roughly 4% of probe sets. As a last resort we used Method 4: We retained position data from mm4 or mm3 without interpolation. No position data could be found for 198 records and no chromosome could be found for 46 probe sets. We estimate that 5 to 10% of position data are unreliable.
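The four methods form a fallback cascade, which can be sketched in Python as follows. This is an illustration of the logic described above, not the script that was actually used. The function name resolve_position, the dictionary-based inputs, the source labels, and the probe set names in the usage example are all hypothetical.

```python
def resolve_position(probe_set, blat_hits, aff_table, affy_annotation, legacy):
    """Return (position, source) for a probe set using the four-step cascade.

    blat_hits maps probe set -> {"score", "identity", "position"}.
    The other three arguments map probe set -> position.
    """
    # Method 1: accept the BLAT hit only with score above 99 and identity of 100.
    hit = blat_hits.get(probe_set)
    if hit is not None and hit["score"] > 99 and hit["identity"] == 100:
        return hit["position"], "Method 1 (BLAT)"

    # Methods 2 to 4: take the first table that has an entry, in order.
    fallbacks = (
        ("Method 2 (affMOE430 table)", aff_table),
        ("Method 3 (Affymetrix annotation)", affy_annotation),
        ("Method 4 (mm4 or mm3 position)", legacy),
    )
    for source, table in fallbacks:
        if probe_set in table:
            return table[probe_set], source

    return None, "no position found"


# Hypothetical inputs: positions are (chromosome, megabase) pairs.
blat = {"ps_1": {"score": 100, "identity": 100, "position": ("1", 10.0)},
        "ps_2": {"score": 95, "identity": 98, "position": ("2", 20.0)}}
print(resolve_position("ps_1", blat, {}, {}, {}))
print(resolve_position("ps_2", blat, {"ps_2": ("2", 20.5)}, {}, {}))
print(resolve_position("ps_3", blat, {}, {}, {}))
```

Recording the source alongside each position makes it possible to flag the less reliable entries later, for example those retained from mm4 or mm3.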
2004-10-16: Expression data set for the striatum of BXD strains released by Glenn Rosen to the www.webqtl.org/search3.html beta site. This is the first WebQTL database that exploits the Mouse Expression 430 2.0 array from Affymetrix. Four versions were released: MAS5, RMA, PDNN, and the new GCRMA.
2004-10-11: New hierarchical Search Page interface released to main site (Choose species, cross, type, and database). New Info pages released. More complete annotation and explanation of the use of the pair-scan data is now provided when the "permutation" option is selected in the Analysis Tools area of the Trait Data and Editing Form.
2004-09-22: Pair-scan feature is now zoomable. Click on any single chromosome pair region to zoom in.
2004-08-20: Pair-scan permutation test is now available. It takes 90 seconds to do 500 permutations.
2004-07-15: New Pair-scan function, which searches for pairs of chromosomal regions that may be involved in two-locus epistatic interactions, has been added to WebQTL.
2004-06-07: Interval mapping graph in 2X resolution is now available for downloading.
2004-06-02: Three new B6D2F2 databases have been added to WebQTL. Dominance estimation for interval mapping with F2 data is available.
2004-05-03: Cluster QTL map display has been added to WebQTL. These QTL heat maps can be drawn using three different color assignments.
2004-03-18: Users are now able to add their own traits to selections. The correlation matrix, multiple mapping, and some other features can be used with those traits.
Information about this text file:
This text file was originally generated by RWW, March 2004.