| 
	    | 
		    | 
			    | UCSC Gene Sorter User's Guide |  
 |  |  
 
	    | 
		    | 
			    |  | 
|---|
 |  | 
				Genes function and evolve together. To understand a gene,
				you often need to understand an entire gene family.
				Many such families are already known and described, such as the 
				HOX family that mediates many aspects of limb and brain development 
				and the Cytochrome P450 family that is central to the metabolism of many 
				medications.
				 
				One easy way to identify well-known relatives of a gene 
				is by looking for genes with name similarity, because biologists 
				tend to use similar names for similar genes. However, scientists
				only partly understand the function of perhaps one-third of
				the genes of the genome; therefore, other techniques for grouping genes
				into families are necessary.  
				 
				The UCSC Gene Sorter is an excellent resource for 
				exploring gene families and the relationships among genes. This 
				tool displays a table of genes within 
				a selected genome that are related to one another. Several 
				different relationships may be explored: protein-level homology, 
				similarity of gene expression profiles, or genomic proximity. The 
				Gene Sorter supports searches on a variety of terms and phrases, 
				including the gene name, the UniProtKB protein name, a GenBank 
				accession, or a word or phrase present in a gene's description. The 
				gene family display is highly configurable, allowing the user to 
				control the order and number of columns, the number of rows, and 
				the genes displayed. The tool provides several output formats, 
				including a simple tab-delimited format that may be imported into a 
				spreadsheet or a relational database.
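
As a rough illustration of the import step, the sketch below loads tab-delimited text into an in-memory SQLite table using only the Python standard library. The column names (`name`, `position`, `description`) and the sample rows are hypothetical placeholders, not the Gene Sorter's actual export header.

```python
import csv
import io
import sqlite3

# Hypothetical sample of tab-delimited output; real column names will differ.
SAMPLE = (
    "name\tposition\tdescription\n"
    "GENE_A\tchr1:100-200\tplaceholder description A\n"
    "GENE_B\tchr2:300-400\tplaceholder description B\n"
)

def load_tsv(text, conn, table="genes"):
    """Load tab-delimited text with a header row into a SQLite table."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    # Every column is stored as TEXT; convert types later as needed.
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    marks = ", ".join("?" for _ in header)
    # Skip malformed rows whose field count does not match the header.
    rows = [r for r in reader if len(r) == len(header)]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
n_loaded = load_tsv(SAMPLE, conn)
```

To work with a saved file instead, read its contents and pass the text to `load_tsv`.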
				 
				An important use of the Gene Sorter is to gather 
				together a collection of genes that share similar properties for
				statistical analysis.  For instance, one might want to examine
				promoter regions of genes that share a similar expression
				pattern  or look for protein sequence motifs in genes that
				share similar GO annotations.  
				 
				One of the most powerful features of the Gene Sorter is its 
				filtering capabilities. The filter enables the user to 
				quickly select an interesting subset of the 25,000 genes in 
				the genome based on a variety of detailed and flexible selection criteria. 
				For example, the filter 
				may be used to select all human genes over-expressed in the cerebellum 
				that have GO-annotated G-protein coupled receptor activity. 
				 
				The Gene Sorter was designed and implemented by Jim Kent, 
				Fan Hsu, David Haussler, and the UCSC Genome Bioinformatics Group. 
				This work is supported by a grant from the National Human Genome 
				Research Institute and by the Howard Hughes Medical Institute.
			         |  |  
 |  |  
 
	    | 
		    | 
			    |  | 
|---|
 |  | 
				To begin using the Gene Sorter, you will first
				have to select a genomic region and the type of gene relationship
				you wish to display.  
				You may also want to change some of the Gene Sorter's configuration
				settings to tailor the display to your research needs. These
				configuration options are described in Configuring
				the Gene Sorter display.
				 
				Starting the Gene Sorter
				 
				Open the Gene Sorter home page. 
				Specify the genome and assembly you wish to view by selecting 
				the appropriate options from the genome and 
				assembly pull-down menus.
				Type a term or phrase into the search text box to
				determine which genes will be displayed in the browser. Valid search
				terms (with human genome examples) include:
				    
				    a gene name (HOXA9)
				    a UniProtKB protein name (HXA9) 
				    a word or phrase that occurs in the description of a gene (MAP kinase)
				    a GenBank mRNA accession (U14680)
				    
				Choose the gene relationship that you would like to examine by 
				selecting an option from the sort by pull-down menu. Genes will
				be sorted in order of proximity to the chosen gene, based on one of the
				following criteria:
				    
				    Expression (GNF Atlas1) -- similarity in gene expression, based on GNF 
					Atlas 1 data
				    Protein Homology - BLASTP -- similarity in protein homology, based on the 
					BLASTP E-value
				    Protein Homology - Rankprop -- similarity in protein homology, based
					on the Rankprop algorithm
				    Protein Homology - PSI-BLAST -- similarity in protein homology, based on 
					the PSI-BLAST E-value
				    Pfam Similarity -- similarity based on number of shared domains
				    Gene Distance -- absolute distance (left or right) on the chromosome 
					from the selected gene
				    Chromosome -- list sorted by chromosomal location
				    Name Similarity -- similarity to the name of selected gene, based on 
					the first several characters of the name
				    Alphabetical -- list sorted by gene name
				    GO Similarity -- number of Gene Ontology (GO) terms shared with 
					selected gene 
				    
				Choose the number of items to display from the display 
				pull-down menu (the default is 50).
				Press the Go! button to display your search results.
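
The idea of ordering genes relative to a selected gene can be sketched as follows. This is only an illustration of three of the criteria listed above (gene distance, name similarity, and GO similarity), not the Gene Sorter's implementation. The gene names, positions, and term labels are made-up placeholders, and the sketch assumes all genes lie on one chromosome.

```python
from os.path import commonprefix

# Hypothetical records: a position on one chromosome and a set of GO-like terms.
GENES = {
    "ABCA1": {"pos": 1000, "go": {"t1", "t2", "t3"}},
    "ABCA2": {"pos": 5000, "go": {"t1", "t2"}},
    "ABCB1": {"pos": 1200, "go": {"t3"}},
    "XYZ1":  {"pos": 900,  "go": {"t1", "t2", "t3"}},
}

def sort_genes(selected, key):
    """Return gene names ordered by closeness to the selected gene."""
    ref = GENES[selected]

    def score(name):
        gene = GENES[name]
        if key == "distance":
            # Absolute distance, left or right, from the selected gene.
            return abs(gene["pos"] - ref["pos"])
        if key == "name":
            # Longer shared leading characters means more similar (lower score).
            return -len(commonprefix([name, selected]))
        if key == "go":
            # More shared terms means more similar (lower score).
            return -len(gene["go"] & ref["go"])
        raise ValueError(f"unknown sort key: {key}")

    # The selected gene always comes first; ties are broken alphabetically.
    return sorted(GENES, key=lambda n: (n != selected, score(n), n))
```

For example, `sort_genes("ABCA1", "name")` places `ABCA2` ahead of `ABCB1` because it shares a longer leading run of characters with the selected gene.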
				 |  |  
 |  |  
 
            | 
                    |  
                            | Understanding 
                                the Gene Sorter display |  
                            |  | 
|---|
 |  | 
                                The main page of the Gene Sorter displays a table
				containing rows of genes and associated attributes. In most 
				cases, the currently-selected gene is shown at the top of 
				the list, highlighted in light green. The remaining genes
				are ordered relative to the selected gene based on the sort
				criteria specified in the sort by menu. For example,
				in a table sorted by gene distance, the genes are listed in
				order of greater to lesser chromosomal proximity to the 
				selected gene.
				 				
				The initial Gene Sorter display shows only a default subset of
				the columns available. The set of columns may be expanded,
				reduced, and rearranged by using the Gene Sorter's 
				configuration utility. To view 
				information about the data shown in the column, click on
				the column's label.
				 
				To select a different gene in the table, click on the name of
				the gene. The Gene Sorter will move the gene entry to the top of the 
				list and highlight it. The remaining genes will be reordered
				relative to the new selection.
                              	   
				Column descriptions (listed in alphabetical order)
				 
				
				# column: Displays the position of each gene in the table,
				which is useful when examining data in tables with
				many rows. Clicking on the gene number selects it and moves
				it to the top of the list.
	
				3' UTR Fold column: Shows the estimated 
				energy in kcal/mol of folding the 3' UTR of a 
				gene into the best predicted secondary structure. 
				The energy calculations and secondary structure 
				predictions were obtained from the RNAfold 
				program, which is part of the Vienna RNA 
				Package.
	
				5' UTR Fold column: Shows the estimated 
				energy in kcal/mol of folding the 5' UTR of a 
				gene into the best predicted secondary structure. 
				The energy calculations and secondary structure 
				predictions were obtained from the RNAfold 
				program, which is part of the Vienna RNA 
				Package.
  
				Abundance column (Yeast): Displays protein 
				abundance information, as reported in Ghaemmaghami
				et al. Global analysis of protein 
				expression in yeast, Nature 
				425(6959), 737-741 (2003). For more 
				information, see the 
				Yeast GFP Fusion Localization 
				Database.
  
				Arbeitman et al. 2002 Life-Cycle Expression Data column (Fruitfly):
				Shows the median ratio of gene expression in various phases of the fly life cycle
				relative to the expression of the gene in mixed egg-to-adult cultures.
				The level of detail may be
				increased or decreased with the configuration
				utility. See the Expression columns 
				description for more information about the display and configuration 
				of this column. For more details about the experiments and methods used to 
				create this data, click on the column's label.
	
				BLASTP Bits column: Shows the bit score measure of protein similarity
				between a gene and the selected gene. The greater the 
				similarity between two proteins, the higher the bit score. For
				more information about how the bit score was calculated, click
				on the Bits column label.
  
				BLASTP E-Value column: Shows the Blastp E-value (expectation value) 
				between each gene and the selected gene. The greater the 
				similarity of two proteins, the lower the E-value is. 
				Identical long proteins have an E-value of zero. 
				Formally, an E-value is the number of other known genes that 
				are expected to have at least this level of homology by chance.
				An E-value of less than 0.1 can be safely interpreted as the
				probability that a match this good would occur merely by
				chance.  For more information about how the E-value was 
				calculated, click on the E-Value column label.
				Clicking on a gene's E-value displays an alignment
				between the gene and the selected (highlighted)
				gene.
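
The remark about small E-values can be checked numerically. The sketch below uses the standard BLAST statistics relation P = 1 - exp(-E), which assumes the number of chance hits follows a Poisson distribution with mean E. It is offered as background for reading the column, not as a description of how the Gene Sorter computes its values.

```python
import math

def evalue_to_pvalue(e_value):
    """Probability of at least one chance match this good.

    Assumes the count of chance hits is Poisson-distributed with mean E,
    so P = 1 - exp(-E). expm1 keeps precision for very small E.
    """
    return -math.expm1(-e_value)

# Compare E and P over a range of E-values.
for e in (1e-5, 0.01, 0.1, 1.0, 10.0):
    print(f"E = {e:g}  P = {evalue_to_pvalue(e):.6g}")
```

Below about 0.1 the two quantities are nearly equal, which is why a small E-value can be read as a probability. For larger E-values the probability saturates toward 1 while E keeps growing.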
  
				C. elegans column: Shows the best Blastp match to the WormBase 
				protein set. Clicking on the ID number displays the 
				corresponding WormBase database record.
  
				Coding SNPs column: Shows the Simple Nucleotide Polymorphisms (SNPs) 
				located in the coding region of each gene. Clicking on a SNP 
				ID displays the SNP record associated with the gene in 
				NCBI's dbSNP.
  
				Description column: Shows a brief description of each gene taken from 
				its mRNA record. Clicking on a description displays a page 
				showing additional information and links for the gene.
  
				Drosophila column: Shows the FlyBase ID of the best Blastp match to 
				the FlyBase protein set. Clicking on the FlyBase ID displays
				the corresponding FlyBase report.
  
				Ensembl column: Shows the Ensembl transcript ID associated with 
				the gene. Ensembl is an automatic gene prediction pipeline and
				a major genome database and web site run by the Sanger
				Institute and the European Bioinformatics Institute (EMBL-EBI).
				It is especially effective at mapping genes that are known 
				in one organism to another organism. Compared to other gene
				predictions, those of Ensembl tend to have high specificity 
				but low sensitivity to genes not already associated with
				characterized mRNA or protein sequence. Clicking on an 
				Ensembl ID displays the Ensembl GeneView page for the gene.
  
				Entrez Gene column (Human): Formerly LocusLink. Shows 
				the NCBI Entrez Gene
				ID associated with the gene. Clicking on the entry
				displays the Entrez Gene record. If the record shows a 
				link to the Online Mendelian Inheritance in Man 
				(OMIM) -- indicated by an orange square with an 
			 	"O" inside it -- an OMIM record is available for 
				the gene.
  
				Exon Count column: Displays the number of
				exons in the gene (coding and non-coding).
  
				Exp Delta column: Shows the similarity of the expression of 
				each gene to the selected gene. Genes with identical expression 
				profiles have a value of 0. This column shows data only for the 
				1000 genes (including splicing variants) that have the most 
				similar expression profiles.  For more information about how 
				expression distance was calculated, click on the Exp Delta column 
				label.
  
				
				Expression columns: Show the ratio of 		
				expression of the gene in selected tissues or life cycle stages
				to the expression of the gene overall. 
				A gene that is more highly expressed is colored red, and a less 
				expressed gene is shown in green. The values are colored on a 
				logarithmic scale. This coloring is standard, but is the opposite 
				of what an inexperienced user might expect: in this case, red means go 
				and green means stop! Black indicates that a gene is neither 
				over nor under expressed in the tissue. Uncolored boxes 
				(white on most browsers) represent missing data. 
				Various attributes of the expression column displays may be configured 
				using the configuration utility. Depending on the 
				organism, the user
				may adjust the color scheme and brightness, toggle between expression
				ratio and absolute expression values, and increase or decrease the 
				level of detail shown. In particular, color-blind users 
				may wish to switch the coloring from red/green to yellow/blue.
				For more information about the selection criteria used for expression	
				columns, click on the column's label.
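
One way to picture logarithmic coloring is the sketch below, which maps an expression ratio to an RGB triple. The base-2 logarithm, the saturation point (`max_log`), and the way `brightness` scales the result are all assumptions made for illustration; the Gene Sorter's actual color mapping may differ.

```python
import math

def ratio_to_rgb(ratio, brightness=1.0, max_log=3.0, scheme="red/green"):
    """Map an expression ratio to an (R, G, B) tuple on a log scale.

    Returns None for missing data so the box can be left uncolored.
    A ratio of 1.0 (neither over- nor under-expressed) maps to black.
    """
    if ratio is None or ratio <= 0:
        return None
    # Position on the log scale, scaled by brightness and clamped to [-1, 1].
    level = math.log2(ratio) / max_log * brightness
    level = max(-1.0, min(1.0, level))
    intensity = round(abs(level) * 255)
    if scheme == "red/green":
        # Red for higher expression, green for lower expression.
        return (intensity, 0, 0) if level > 0 else (0, intensity, 0)
    # Any other scheme value is treated as yellow high / blue low.
    return (intensity, intensity, 0) if level > 0 else (0, 0, intensity)
```

Under these assumptions, `ratio_to_rgb(8.0)` is fully red, `ratio_to_rgb(0.125)` is fully green, and `ratio_to_rgb(1.0)` is black.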
  
				GenBank column: Shows the GenBank RefSeq or mRNA accession number
				associated with the gene. Clicking on the accession number 
				displays the GenBank record associated with it.
  
				Gene Ontology column: Shows the Gene Ontology (GO) terms associated with 
				the gene. GO terms are words from a controlled vocabulary 
				assigned to a gene by human curators. Clicking on a GO term 
				displays the associated Gene Ontology Consortium database 
				record.
  
				Genome Position column: Shows the chromosomal location of each gene in 
				the genome. Clicking on a chromosomal position displays the 
				gene at that location in the UCSC Genome Browser.
  
				GNF1M ID column (Mouse): Shows the 
				Affymetrix ID from the GNF1M chip that best 
				corresponds to each gene. For more details about 
				the experiments and methods used to 
				create this data, click on the column's label.
  
				GNF Atlas2 column (Human): Shows the ID of 
				the probe in the GNF Atlas 2 that overlaps most 
				with the selected gene. The GNF Atlas 2 is based 
				on two Affymetrix chips: the U133A and a 
				custom-designed GNF1H chip. 
  
				GNF Delta column: Shows the similarity of the 
				expression of each gene to the selected gene. 
				Genes with identical expression profiles have a 
				value of 0. This column shows data only for the 
				1000 genes (including splicing variants) that 
				have the most similar expression profiles.
  
				GNF U74a, GNF U74b, GNF U74c columns (Mouse): Show data
				from the Mouse Gene Expression Atlas from the Genomics Institute of 
				the Novartis Research Foundation (GNF) on the Affymetrix U74a, U74b,
				and U74c chips. By default, the columns display the median ratio of 
				expression in a specific set of tissues relative to the expression 
				of the gene overall. The level of detail may be
				increased or decreased with the configuration
				utility. Currently, the full spectrum of tissues is available only 
				on the U74a chip. 
				See the Expression columns 
				description for more information about the display and configuration 
				of this column. For more details about the experiments and methods used to 
				create this data, click on the column's label.
  
				GNF U95 column (Human): Shows data from the GNF Expression 
				Atlas on the Affymetrix U95 chip. By default, the column
				displays the median ratio of expression in a specific set of tissues
				relative to the expression of the gene overall. The level of detail may be
				increased or decreased with the configuration
				utility. See the Expression columns 
				description for more information about the display and configuration 
				of this column. For more details about the experiments and methods used to 
				create this data, click on the GNF U95 column's label.
  
				Human column (Mouse): Shows the best Blastp match to the known genes
				protein set from the UCSC Human Genome Browser database. 
				Clicking the accession number displays the gene in the UCSC 
				Human Genome Browser.
  
				Kim-Lab Life-Cycle Expression Data (All) column (C. elegans):
				Shows the ratio of gene expression in all phases of the worm life cycle relative to 
				the expression of the gene in mixed wild-type adult cultures.
				See the Expression columns 
				description for more information about the display and configuration 
				of this column. For more details about the experiments and methods used to 
				create this data, click on the column's label.
  
				Kim-Lab Life-Cycle Expression Data (Median) column (C. elegans):
				Shows the median ratio of gene expression in selected phases of the worm life cycle. 
				See the Expression columns 
				description for more information about the display and configuration 
				of this column. For more details about the experiments and methods used to 
				create this data, click on the column's label.
  
				Max GNF Atlas 2 column: Shows the 
				maximum absolute expression level for any tissue 
				in the GNF Gene Expression Atlas 2. Most of the 
				values fall in the zero to 50,000 range, but a 
				few outliers may range as high as 200,000. A 
				value of less than 20 indicates the expression 
				could not be detected in any tissue at levels 
				significantly above the cross-hybridization 
				controls.
  
				Max GNF U95 column (Human): Shows the 
				maximum absolute expression level for any tissue. 
				Most values range between
				zero and 30,000, but a few outliers may range as
				high as 52,000. A value of less than 20 indicates
				the expression could not be detected in any 
				tissue at levels significantly above the 
				cross-hybridization controls. 
                              	  
				Max Rinn Sex column (Mouse): Shows the 
				maximum expression value in adult male and female 
				mouse tissue as described in (Rinn et al., 
				Developmental Cell, 2004). For more information
				about the methods used to generate these data,
				click on the Max Rinn Sex column header.
					
				Module column (Yeast): Shows the predicted 
				regulatory module (a combination of transcription 
				factor binding sites) that regulates the gene. 
				The approach underlying this annotation is 
				described in Segal, E. et al., 
				Genome-wide discovery of 
				transcriptional modules from DNA sequence and gene
				expression, Bioinformatics 
				19(Suppl 1), i273-i282 (2003). For more
				information about the methods used, click the
				Module column header. To view genes that share
				a regulatory module, select the Regulatory 
				Module option from the sort by 
				menu.
	
				MOE430 ID column (Mouse): Shows the 
				Affymetrix ID from the MOE430 series of chips 
				(A & B) that best corresponds to each gene.
				For more details about the experiments and methods
				used to create this data, click on the column's 
				label.
	
				Mouse column (Human): Shows the accession 
				number of the best Blastp match to the known genes 
				protein set in the UCSC Mouse Genome Browser database. 
				Clicking on the accession number displays the gene in 
				the UCSC Mouse Genome Browser.
  
				Name column: Displays the name of the gene. When possible, 
				the HUGO Gene Nomenclature Committee (HGNC) name is shown. 
				If the gene does not yet have an HGNC 
				name, the GenBank accession number of the associated RefSeq 
				or mRNA record is shown instead. Clicking on the gene name
				selects it and moves it to the top of the list.
  
				% ID column: Shows the percentage identity at the protein 
				level between the gene and the selected gene. For more 
				information about how the % ID was 
				calculated, click on the % ID column label.
  
				PDB column: Displays all Protein Data 
				Bank (PDB) IDs associated with the gene. PDB is a
				database of proteins with known 3-D structures. 
				In some cases these records will correspond 
				to only a fragment of the gene. In other cases 
				the PDB record may include other molecules with 
				which the protein interacts. Clicking on a PDB
				entry displays the associated PDB Structure
				Explorer page.
  
				Pfam Domains column: Shows a list of 
				Protein Family (Pfam) domains contained in the gene 
				product. Clicking on a domain displays the associated 
				Pfam record.
  
				PSI-BLAST E-Value column: Shows the 
				PSI-BLAST E-value (expectation value) between the
				UniProtKB or TrEMBL protein associated with the 
				gene and the protein associated with the selected 
				(highlighted) gene. The greater the similarity of 
				two proteins, the lower the E-value is. Identical 
				long proteins have an E-value of zero. 
				For more information about how the E-value was 
				calculated, click on the E-Value column label.
				Clicking on a gene's E-value displays an alignment
				between the gene and the selected (highlighted)
				gene.
  
				Rankprop column: Displays protein 
				similarity scores assigned by the Rankprop 
				algorithm. The scores reported in this column 
				range from zero to one, with one being the most 
				significant. Currently, Rankprop does not report 
				an E-value statistic. For more information on this
				algorithm, click on the Rankprop column
				label.
  
				RefSeq column: Displays the NCBI RefSeq 
				accession associated with the gene, if available. 
				RefSeq genes are a non-redundant set of
				high-quality mRNA sequences. Clicking on the accession number
				displays the NCBI Entrez Gene record for a RefSeq accession. If
				the record shows a link to the Online Mendelian Inheritance in
				Man (OMIM) -- represented by an orange box with an "O"
				on it -- an OMIM record is available for the gene. OMIM is 
				often an excellent source of human-curated information about
				human genes.
  
				Regulatory Motif column (Yeast): Shows the 
				predicted transcription factor binding sites that 
				correlate with expression of the gene. 
				The approach underlying this annotation is 
				described in Segal, E. et al., 
				Genome-wide discovery of 
				transcriptional modules from DNA sequence and gene
				expression, Bioinformatics 
				19(Suppl 1), i273-i282 (2003). For more
				information about the methods used, click the
				Module column header.
	
				SGD ORF column: Shows the Saccharomyces Genome 
				Database (SGD) ID associated with the gene.
  
				SP Acc column: Shows the UniProtKB 
				protein accession of a gene. Clicking on an entry
				displays the corresponding UniProtKB NiceProt
				view of the protein.
                              	  
				Superfamily column: Displays a list of 
				Structural Classification of Proteins (SCOP)
				superfamilies associated with a protein. The gene 
				set was mapped to SCOP superfamilies using the 
				Superfamily HMM library. Clicking on an entry
				displays the associated Superfamily record.
  
				UniProtKB column: Displays the UniProtKB 
				protein name of each gene, 
				if it is available. Otherwise, it shows the primary 
				accession number. Clicking on a protein name or accession 
				displays the corresponding UniProtKB NiceProt view of the
				protein.
  
				U133 ID column (Human): Shows the Affymetrix ID 
				from the HG-U133 chip that best 
				corresponds to each gene. For more information about 
				the selection criteria used for this column, click on 
				the U133 ID column's label.
  
				U133Plus2 ID column (Human): Shows the 
				Affymetrix ID from the HG-U133 Plus 2.0 chip that 
				best corresponds to each gene. For more 
				information about the selection
				criteria used for this column, click on the 
				U133Plus2 ID column's label.
  
				U74 ID column (Mouse): Shows the Affymetrix ID 
				from the U74 series of chips (a, b, c) that best 
				corresponds to each gene. For more information about 
				the selection criteria used for this 
				column, click on the U74 ID column's label.
  
				U95 ID column (Human): Shows the Affymetrix ID 
				from the HG-U95 chip that best corresponds to each gene. 
				For more information about the selection criteria used 
				for this column, click on the U95 ID column's label.
  
				UCLA Long Expression column (Human): Shows UCLA 
				expression data from normal tissues on the U133 chip. 
				This column shows the ratio of expression of a gene in the entire 
				tissue set relative to the expression of the gene overall.
				See the Expression columns 
				description for more information about the display and configuration 
				of this column. For more information about the methods used to 
				generate this data, click on the UCLA Long Expression column's label.
  
				UCLA Short Expression column (Human): Shows UCLA 
				expression data from normal tissues on the U133 chip. 
				This column shows the ratio of expression of a gene in a particular
				subset of tissues relative to the expression of the gene overall.
				See the Expression columns 
				description for more information about the display and configuration 
				of this column. For more information about the methods used to 
				generate this data, click on the UCLA Short Expression column's label.
  
				WormBase column (C. elegans): Shows the ORF name 
				associated with each gene.  Clicking on an 
				ORF name displays the associated WormBase record.
  
				Yeast column: Shows the best Blastp match to the Saccharomyces 
				Genome Database (SGD) protein set. Clicking on the yeast ID 
				displays the corresponding SGD record.
  
				Zebrafish column: Shows the Ensembl peptide ID of the best Blastp 
				match to the Ensembl gene predictions on the zebrafish genome. 
				Clicking on the Ensembl peptide ID displays it in Ensembl 
				Zebrafish Protein View.
				 |  |  
 |  |  
 
	    | 
		    | 
			    | Configuring the 
				Gene Sorter display |  
			    |  | 
|---|
 |  | 
				The Gene Sorter is highly configurable, allowing you
				to fine-tune the display to show just the genes and data
				columns that interest you, in the order that best suits your research
				needs. Most of the configuration is controlled through settings on
				the Configuration page, accessed via the configure button at
				the top of the Gene Sorter page.
				 
				Changing the number of rows displayed
				 
				To increase or decrease the number of rows shown in the table, pick a
				new value from the display pull-down menu, then click the Go! 
				button.
 
				Changing the number of columns displayed
				 
				By default, the Gene Sorter shows only a small subset of the
				table columns available for the genome. You can view the full list of columns,
				or add or remove columns from your display, on the Configuration page.
 The configuration table shows all the
				columns available for the currently-selected genome, listed in left-to-right
				display order. To add or remove a column from the Gene Sorter display, click the
				On checkbox to toggle the setting (a check indicates that the column
				is displayed). To quickly change the On settings of all columns,
				click the Hide All or Show All button at the top of the page. Click the 
				Submit button to display the changes in the Gene Sorter.
				 
				Changing the column positions
				 
				In addition to adding or removing columns, it is 
				also possible to move the columns to the left or right within the Gene Sorter
				table. The order of the column names in the configuration table indicates the
				current relative position of the columns in the Gene Sorter display from left to
				right. To shift a column one position to the left, click the up arrow in the
				item's Position column. Similarly, click the down arrow to shift a 
				column to the right. When you have finished making changes, click the Submit
				button.
 
				Changing the expression colors
				 
				By default, the gene expression ratios are shown using a red/green color scheme,
				where red indicates a gene that is more highly expressed and green corresponds
				to less expression. Color-blind users may find it helpful to switch the 
				coloring from red/green to yellow/blue. To do so, select the "yellow high/blue
				low" option from the Expression ratio colors pull-down menu on the 
				Configuration page, then click the Submit button.
 
				Changing the brightness of expression colors
				 
				To increase or decrease the brightness of the colors in an expression 
				column, edit the brightness value for the corresponding entry in the 
				configuration table. Values greater than 1.0 increase the brightness, while 
				those less than 1.0 dim the color. Click the Submit button to display the 
				new values.
 
				Changing the type of tissue data shown in expression columns
				 
				By default, the expression columns show the median ratio of expression
				of a gene in a small selected set of tissues. Use the tissues 
				pull-down menu to configure the tissue display for the column. 
				The "all replicas" option will show the value
				of each individual experimental replica of each tissue. The "median 
				of replicas" option displays a single value for each tissue that 
				represents the median of all replicas for that tissue.
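
The difference between the two options can be illustrated with a small sketch. The tissue names and replicate values below are invented placeholders, and the functions only mimic the kind of summary each option shows; they are not taken from the Gene Sorter itself.

```python
from statistics import median

# Hypothetical replicate measurements for one gene, keyed by tissue.
REPLICAS = {
    "tissue_a": [1.0, 3.0, 2.0],
    "tissue_b": [0.5, 1.5],
}

def all_replicas(data):
    """One value per experimental replica of each tissue."""
    return [(tissue, value) for tissue, values in data.items() for value in values]

def median_of_replicas(data):
    """A single value per tissue: the median of that tissue's replicas."""
    return {tissue: median(values) for tissue, values in data.items()}
```

With this sample data, the "all replicas" view would contain five values, while the "median of replicas" view would contain one value for each of the two tissues.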
 
				Toggling between ratio and absolute expression values
				 
				By default, expression columns show the ratio of expression of a gene relative
				to expression of the gene overall. To view absolute expression values instead,
				select the "absolute" option from the values pull-down menu.
 
				Displaying splicing variants
				 
				By default, the Gene Sorter shows only one splicing variant: the one that produces
				the largest protein. To show all splicing variants, click the Show all 
				splicing variants checkbox. Note that in most cases, the column values
				(and sometimes the names) will be identical across variants.
 
				Restoring the default settings
				 
				At any time during your Gene Sorter session, you can restore the Gene Sorter table
				to its default layout by clicking the Default button on the Configuration page,
				then clicking Submit.
 
				Saving a configuration for future use
				 
				The Gene Sorter configuration utility allows you to store multiple configurations
				for use in future sessions. This feature is particularly useful if you require
				different layouts for different research uses.
				To save the current configuration of the Gene Sorter layout, click the Save button
				on the Configuration page. Type in a name for the configuration in the text
				box at the top of the page, then click Save.
 
				Loading a previously-saved configuration
				 
				Once you have saved a configuration, you can load it back into your Gene Sorter
				in a future session. To load a configuration, click the Load button on the
				Configuration page. The Gene Sorter will display a list of the names of your saved 
				configurations. Click on a name to highlight it, then click Load to 
				reconfigure your Gene Sorter based on the saved settings.
 
				Viewing a list of saved configurations
				 
				To display a list of configurations that you have saved, click the Save button
				on the Configuration page. If you have any saved configurations, the Gene Sorter 
				will display an Existing Setups list that shows the configuration names. To
				permanently remove a configuration from the list, click on the name to 
				highlight it, then click the Delete Existing Setup button.
 |  |  
 |  |  
 
	    | 
		    | 
			    | Filtering the gene display |  
			    |  | 
|---|
 |  | 
				The Gene Sorter's gene filtering capabilities provide a versatile
				way to fine-tune the display to show just the genes in which you are
				interested. Filters are applied to individual gene fields, and may be
				combined to increase the specificity of the search. To access the Filter page, 
				click the filter button at the top of the Gene Sorter page. 
				 
				At any time during the filter setup process, you can click the List Names
				button on the Filter page to view a list of genes
				that will be returned when the current filter settings are applied to the
				genome. You may find this list helpful in fine-tuning the filter. 
				 
				Filtering based on matching one or more terms
				 
				Filters based on names, IDs, or other words restrict the display 
				to only those genes that match one or more terms typed into the 
				search text box. 
				Examples of values that can be filtered on this basis include the gene
				name, RefSeq accession number, gene description, coding SNPs, and GO terms.
 
				This search supports wildcard matching on "*" and "?".
				Multiple terms must be separated by a space or tab. For example,
				the search criteria "HOXA9 FOX*" on the gene name field returns the gene 
				named HOXA9 and any gene whose name begins with the letters "FOX". 
				When searching on fields that consist of values containing more than one 
				word (GO terms, coding SNPs, Pfam domains, and gene descriptions), the 
				multi-word
				elements must be enclosed in single quotes. For instance, a search on the
				description phrase "forkhead box protein" should be entered as
				'forkhead box protein'.
				Use the 
				"any" and "all" options to determine whether the search should 
				return any gene that matches any term ("any") or only those genes that 
				match all terms ("all").
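				 
				The matching rules above can be sketched in a few lines of Python. This is an illustration only, not the Gene Sorter's own code: the function names are hypothetical, case-insensitive matching is an assumption, and the standard-library fnmatch module also accepts "[...]" character sets, which this guide does not mention.

```python
import shlex
from fnmatch import fnmatchcase

def parse_terms(query):
    """Split a filter string on whitespace, keeping 'single-quoted phrases' whole."""
    return shlex.split(query)

def matches(value, terms, mode="any"):
    """Return True if value matches any (or all) of the wildcard terms.

    Case-insensitive comparison is an assumption made for this sketch.
    """
    hits = (fnmatchcase(value.upper(), term.upper()) for term in terms)
    return any(hits) if mode == "any" else all(hits)

# Illustrative gene names; "HOXA9 FOX*" is the example query from the text.
genes = ["HOXA9", "FOXP2", "FOXA1", "TP53"]
terms = parse_terms("HOXA9 FOX*")
selected = [g for g in genes if matches(g, terms, "any")]
print(selected)  # ['HOXA9', 'FOXP2', 'FOXA1']
```

				With mode set to "all", the same query selects nothing from this list, because no single gene name matches both HOXA9 and FOX*.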
				 
				To facilitate searching on multiple terms, the Gene Sorter provides
				the option to paste in or upload a list of search terms. To paste in a list
				of terms, click the filter's Paste List button, then paste or type the 
				terms into the text box. Terms must be separated by a space or a tab, or
				entered on separate lines, and may not include wildcards. When you have 
				completed the list, click the Submit button to return to the main Filter page.
				The file upload utility, accessed via the Upload List button, works in a
				similar way.
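				 
				If you build a term list in a script before pasting or uploading it, a quick check along the following lines can catch formatting problems early. It is a minimal sketch under the rules stated above; the function name is hypothetical and the Gene Sorter's own validation may differ.

```python
def parse_pasted_terms(text):
    """Split pasted text on spaces, tabs, and newlines; reject wildcard characters."""
    terms = text.split()
    bad = [t for t in terms if "*" in t or "?" in t]
    if bad:
        raise ValueError(f"wildcards are not allowed in a pasted list: {bad}")
    return terms

term_list = parse_pasted_terms("HOXA9\tHOXA10\nFOXP2 TP53")
print(term_list)  # ['HOXA9', 'HOXA10', 'FOXP2', 'TP53']
```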
				 
				Filtering based on numerical ranges
				 
				Several of the gene fields can be filtered by specifying a numerical range
				within which the value must fall. Examples of
				fields in this category include expression ratios, Blastp data, and genome
				position. To use this type of filter, enter the minimum and maximum
				values delimiting the range in which you are interested. In some cases, the
				range of valid values is indicated in the filter box.
 
				The genome position filter requires the name of a chromosome (in the 
				format chrN) in addition
				to the chromosomal start and end positions. To list all genes on a chromosome,
				enter only the chromosome name.
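				 
				The behavior of the position filter can be pictured with a short Python sketch. The Gene record, the gene names, and the coordinates are hypothetical, and the sketch assumes that a gene passes when it overlaps the requested range; the Gene Sorter may instead require full containment.

```python
from typing import NamedTuple, Optional

class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int

def in_region(gene: Gene, chrom: str,
              start: Optional[int] = None, end: Optional[int] = None) -> bool:
    """True if the gene is on chrom and overlaps [start, end].

    Omitting start and end selects every gene on the chromosome.
    """
    if gene.chrom != chrom:
        return False
    if start is not None and gene.end < start:
        return False
    if end is not None and gene.start > end:
        return False
    return True

# Hypothetical records for illustration only.
genes = [
    Gene("geneA", "chr7", 1000, 5000),
    Gene("geneB", "chr7", 20000, 25000),
    Gene("geneC", "chrX", 1000, 2000),
]

whole_chrom = [g.name for g in genes if in_region(g, "chr7")]
sub_range = [g.name for g in genes if in_region(g, "chr7", 1, 6000)]
print(whole_chrom)  # ['geneA', 'geneB']
print(sub_range)    # ['geneA']
```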
				 
				Expression filters include "any" and "all" options to 
				determine whether the search should 
				return a gene if any of the tissue expression values meet the minimum and
				maximum criteria ("any") or only if all tissue expression values meet 
				the search criteria ("all").
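				 
				The difference between the "any" and "all" options for expression filters is easiest to see with numbers. The sketch below uses hypothetical gene names and made-up expression ratios, and assumes that the minimum and maximum are inclusive bounds.

```python
def expression_passes(values, minimum, maximum, mode="any"):
    """Check per-tissue expression values against an inclusive [minimum, maximum] range."""
    in_range = [minimum <= v <= maximum for v in values]
    return any(in_range) if mode == "any" else all(in_range)

# Hypothetical expression ratios, one value per tissue.
expression = {
    "geneA": [1.2, 0.4, -0.3],
    "geneB": [1.5, 2.0, 1.1],
}

any_hits = [g for g, v in expression.items() if expression_passes(v, 1.0, 3.0, "any")]
all_hits = [g for g, v in expression.items() if expression_passes(v, 1.0, 3.0, "all")]
print(any_hits)  # ['geneA', 'geneB']
print(all_hits)  # ['geneB']
```

				Here geneA passes under "any" because one tissue falls within the range, but fails under "all" because the other two tissues do not.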
				 
				Saving filter settings
				 
				The Gene Sorter provides a mechanism for saving filter settings for use in future
				sessions. To preserve the current filter configuration, click the Save Filter
				button on the Filter page. Type in a name for the filter, then click Save
				to save the filter and return to the Filter page.
 
				Loading a saved filter
				 
				Once a filter configuration has been saved, you can retrieve it in later
				sessions by loading it back into your Gene Sorter. To load the saved filter
				settings, click the Load Filter button on the Filter page. Click on the
				name of the filter you wish to load, then click the Load button. Click
				the Submit button on the Filter page to apply the filter settings to the
				Gene Sorter.
 
				Viewing a list of saved filters
				 
				To display a list of filter settings that you have saved, click the Save button
				on the Filter page. If you have any saved filters, the Gene Sorter 
				will display an Existing Setups list that shows the filter names. To
				permanently remove a filter from the list, click on the name to 
				highlight it, then click the Delete Existing Setup button.
 |  |  
 |  |  
 
	    | 
		    | 
			    | Displaying sequence and text-based output |  
			    |  | 
|---|
 |  | 
				The Gene Sorter's graphical presentation of data facilitates the visual
				observation of relationships and patterns among the genes in the
				display. However, it is often useful to convert the data to a text-based
				format that can be easily saved to a file or loaded into another program,
				database, or spreadsheet for further analysis. The Gene Sorter provides
				a mechanism for saving the current display in a tab-delimited text file or
				showing a text-based view of the sequence underlying the current display.
				 
				Creating text-based output
				 
				To output the current Gene Sorter table as text, click the text button at the
				top of the page. The Gene Sorter will display each row of table data 
				on a separate tab-delimited line.
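				 
				Once saved to a file, the tab-delimited output can be read with standard tools. The Python sketch below is one way to do so; it assumes that the first line is a header row of column names (optionally prefixed with "#"), and the sample text, column names, and gene names are hypothetical rather than actual Gene Sorter output.

```python
import csv
import io

# Hypothetical sample standing in for a saved Gene Sorter text file.
sample = (
    "#name\tchrom\tdescription\n"
    "geneA\tchr1\thypothetical protein A\n"
    "geneB\tchr2\thypothetical protein B\n"
)

def read_table(handle):
    """Read tab-delimited rows into a list of dicts keyed by the header names."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = [name.lstrip("#") for name in next(reader)]
    return [dict(zip(header, row)) for row in reader]

rows = read_table(io.StringIO(sample))
print(rows[0]["name"], rows[0]["chrom"])  # geneA chr1
```

				To read a real file, pass an open file handle in place of the StringIO object, for example open("genes.txt", newline=""), where the file name is whatever you chose when saving.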
 
				Viewing the underlying sequence
				 
				To display the protein, mRNA, or genomic sequence underlying the current
				Gene Sorter table, click the sequence button at the top of the 
				page. On the Get Sequence page, select the desired sequence configuration 
				settings, then click the Get Sequence button. The Gene Sorter will 
				display a text-based list of FASTA format records, one for each gene displayed
				in the table. The FASTA records may be copied and pasted into
				Blat for further study.
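				 
				FASTA records are also simple to process in a script. The following Python sketch splits FASTA text into a dictionary that maps each header line to its sequence; the function name and the sample records are hypothetical, and the exact header layout produced by the Gene Sorter is not assumed.

```python
def parse_fasta(text):
    """Map each FASTA header (without the leading '>') to its joined sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}

# Hypothetical records for illustration only.
sample = ">geneA\nMKTAYIAK\nQRQISFVK\n>geneB\nMSDNELQ\n"
sequences = parse_fasta(sample)
print(sequences["geneA"])       # MKTAYIAKQRQISFVK
print(len(sequences["geneB"]))  # 7
```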
 |  |  
 |  |  |