Table Browser User's Guide

		The Table Browser provides a powerful and flexible graphical
		interface for querying and manipulating the Genome Browser 
		annotation tables. Because the Table Browser uses the same 
		database as the Genome Browser, the two views are always 
		consistent.  
		 
		Using the Table Browser, you can: 
		 
		
		retrieve the DNA sequence data or annotation data underlying 
		Genome Browser tracks for the entire genome, a specified 
		coordinate range, or a set of accessions
		
		apply a filter to set constraints on field values included in 
		the output
		
		generate a custom track and automatically add it to your 
		session so that it can be graphically displayed in the 
		Genome Browser
		
		conduct both structured and free-form SQL queries on the data
		
		combine queries on multiple tables or custom tracks through an 
		intersection or union and generate a single set of output data 
		
		display basic statistics calculated over a selected data set 
		
		display the schema for a table and list all other tables in 
		the database connected to the table
		
		organize the output data into several different formats for use
		in other applications, spreadsheets, or databases
		 
		This User's Guide is aimed at both the novice Table Browser user
		and the advanced user. If you are new to the Table Browser,
		read the Getting started section to learn about browser basics
		and try some simple queries. Advanced users may want to proceed
		directly to the section that addresses a particular 
		area of functionality in detail.
	         
		Although the Table Browser provides sufficient flexibility to 
		satisfy the needs of most users, some advanced users may require
		the ability to run MySQL directly on the Genome Browser database.
		UCSC provides a public MySQL server at 
		genome-mysql.cse.ucsc.edu.
		Alternatively, the database may be downloaded to a local computer
		for MySQL access. See the 
		mirror site
		documentation for information on setting up a local copy of the 
		database.
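
		As a rough illustration of direct MySQL access, the sketch below
		assembles a command line for the standard mysql client. The host
		name is the one given above; the user name "genome", the absence of
		a password, and the hg17 database name are assumptions that this
		guide does not state, so check them against the current server
		documentation before relying on them.

```python
import shlex


def mysql_command(database, query=None,
                  host="genome-mysql.cse.ucsc.edu", user="genome"):
    """Build an argument list for the `mysql` command-line client.

    Assumption: the public server accepts the user "genome" without a
    password. `-A` skips automatic rehashing, `-e` runs one statement.
    """
    argv = ["mysql", f"--user={user}", f"--host={host}", "-A"]
    if query is not None:
        argv += ["-e", query]
    argv.append(database)
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so nothing touches the network.
    print(shlex.join(mysql_command("hg17", "SELECT * FROM refGene LIMIT 5")))
```

		The command is only printed here; pass the list to subprocess.run
		if you want to execute it.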

About the Table Browser databases and tables

		The Table Browser is built on top of the Genome Browser 
		database, which actually consists of several separate 
		databases, one for each genome assembly.  
		 
		Tables within the databases may be differentiated by whether the
		data are based on genome start-stop coordinates 
		(positional tables) or are independent of position
		(non-positional tables). 
		Some output formats and query options are applicable only to 
		positional tables, hence the distinction.
		 
	         Non-positional tables
 
		Non-positional tables contain data not tied to genomic 
		location, for example a table that correlates a Known Gene 
		ID with a RefSeq accession ID. Some non-positional tables relate 
		internal numeric mRNA IDs to extended information such as author, 
		tissue, or keyword.  Some "meta" tables in this category
		contain information about the structure of the database itself 
		or describe external files containing sequence data.  
	         
	         Positional tables
 
		Positional tables contain data associated with specific 
		locations in the genome, such as mRNA alignments, gene 
		predictions, cross-species alignments, and other annotations.
     		Each of the annotation tracks displayed in the Genome Browser
		is based on a positional table. In some instances, data from
		other positional and non-positional tables may also be 
		incorporated into the track. Data associated with custom 
		annotation tracks active within the user's Table Browser session
		are also available as positional tables.
		 
		Positional tables can be further subdivided into several 
		categories based on the type of data they describe. Alignment
		data can be best described by using a block structure to represent
		each element. Other tables require only start and end 
		coordinate data for each element. Some tables specify a 
		translation start and end in addition to the transcription start 
		and end. Some tables contain strand information, others don't.  
		Most tables, but not all, specify a name for each element.  
		Based on the format of the data described by a table, different 
		query and output formatting options may be offered.  
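
		One way to picture these categories is as a check on a table's
		field names. The sketch below is an illustrative heuristic, not
		the Table Browser's actual logic: txStart, txEnd, cdsStart, cdsEnd,
		chrom, and strand appear in the refGene examples later in this
		guide, while the chromStart/chromEnd and tName/tStart/tEnd triples
		are assumed conventions for other table types.

```python
# Field-name triples assumed to mark a positional table (illustrative only).
POSITION_FIELD_SETS = [
    ("chrom", "chromStart", "chromEnd"),  # simple start/end tables
    ("chrom", "txStart", "txEnd"),        # gene prediction tables
    ("tName", "tStart", "tEnd"),          # alignment tables
]


def is_positional(field_names):
    """True if the fields include one of the coordinate triples above."""
    fields = set(field_names)
    return any(fields.issuperset(triple) for triple in POSITION_FIELD_SETS)


def describe(field_names):
    """Summarize which optional kinds of information a table carries."""
    fields = set(field_names)
    return {
        "positional": is_positional(fields),
        "translation_region": {"cdsStart", "cdsEnd"} <= fields,
        "strand": "strand" in fields,
        "name": "name" in fields,
    }


if __name__ == "__main__":
    gene_table = ["name", "chrom", "strand", "txStart", "txEnd",
                  "cdsStart", "cdsEnd", "exonCount"]
    print(describe(gene_table))
    print(describe(["kgID", "refseq"]))  # hypothetical ID-mapping table
```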
		 
		For descriptions of the Genome Browser database tables, 
		see the annotation 
		database documentation.

Getting started - simple queries

		In its most basic form, the Table Browser can be used to 
		retrieve a specific subset of records from a track or positional
		table in a selected genome assembly. The query may be based on
	        a specific position or a set of one or more identifiers.
	         
	        This section describes the steps required to conduct basic
	        data queries using the Table Browser. Once you have
	        mastered the basic Table Browser functionality, refer to 
	        subsequent sections for information about generating more 
	        complex queries that use filters, intersections, and alternative 
	        data output formats.
	 	 
	 	 Simple position-based query
 
		Follow these steps to display a list of records that lie
		within a specific position in a table:
		 
		Step 1. Pick a genome assembly
		Specify the genome assembly from which you'd like to retrieve the
		data by choosing the appropriate organism in the genome
		list, then selecting the assembly version from the 
		assembly list. Note that the assembly list 
		refreshes each time a different option is selected in the 
		genome list.
 
		Step 2. Pick an annotation track
		The group list shows all the annotation track groups 
		available in the selected genome assembly. The names correspond
		to the groupings displayed at the bottom of the Genome Browser
		annotation tracks page. When a group is selected from the list,
		the track list automatically updates to show all the 
		annotation tracks available within that group.
 
		  
		  If you already know the name of the annotation track in which 
		  you're interested, select the All Tracks option in the 
		  group list, then select the track from the 
		  track list.
		  Similarly, you can directly select a table by 
		  choosing the All Tables option in the 
		  group list, selecting a database from the 
		  database list, then selecting the table from 
		  the table list.
		  
		  To examine all the tracks available within a certain group (e.g.
		  all gene prediction tracks), select the group name from the
		  group list, then browse the entries in the 
		  track list.
		  
		  Custom annotation tracks created during the current session
		  are listed under the Custom Tracks group.
		  
		  If no selections are made from the group or 
		  track lists, the track selection defaults to the
		  Known Genes track in the Genes and Gene Prediction Tracks group.
		   
		Step 3. Pick a table
		The table list shows all tables (both positional
		and non-positional) associated
		with the currently-selected track. By default, it 
		displays the primary table for the track, i.e. the table
		containing the data shown in the Genome Browser
		annotation track. Other tables in the list are linked to the
		primary table by a common field and may provide supporting
		data used in constructing the annotation.
 
		
		If the group list is set to the 
		All Tables option, the table list will show all tables
		present in the database currently selected in the 
		database list, rather than those associated with
		a particular track.  
		 
		Step 4. Pick a genomic region (positional tables only)
		By default, the Table Browser region is set to genome,
		which will display all the data records in the selected table.
 
		  
		  To restrict the data to a specific position
		  range, type the position into the position box. Some
		  examples of specific positions include a chromosome name 
		  (chrX), a coordinate range within a chromosome
		  (chrX:100000-400000), or a scaffold name.
		  
		  To look up the position range of a genomic element -- such
		  as a gene name, an accession ID, an STS marker, etc. -- or 
		  keywords from the GenBank description of an mRNA,
		  type the string into the position box, then 
		  click the Lookup button.
		  
		  The data in non-positional tables are not tied to genomic
		  coordinates; therefore, the region option is
		  unavailable when a non-positional table is selected. A basic 
		  query on a non-positional table will show all the data in the 
		  table.
		   
		Step 5. Display the output
		Click the Get Output button to display the 
		results of the query. By default, the Table Browser outputs the 
		data from all fields in the selected table as tab-separated text
		on the screen. See the Output Formats
		section for information on configuring the query output.
 
		Example:
		Here is an example of a simple query that retrieves all the
		RefSeq Genes records in the position range 
		chr7:26906938-26940301 on the May 2004 human genome assembly.
 
		
		Select the Human option in the genome list
		
		Select the May 2004 option in the assembly list
		
		Select the Genes and Gene Prediction Tracks option in the
		group list.
		
		Select the RefSeq Genes option in the
		track list.
		
		Type chr7:26906938-26940301 in the 
		position box (the Table Browser will 
		automatically select the position option button).
		
		Click the Get Output button.
		 
		The Table Browser will display the records for the RefSeq
		accessions NM_005522, NM_153620, NM_006735, NM_153632, NM_030661,
		and NM_153631.
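
		For readers who think in SQL, the position query above can be
		pictured as an overlap test on a table's coordinate fields. The
		sketch below is an approximation run against an in-memory SQLite
		table, not the Table Browser's own query. It assumes that table
		coordinates are zero-based and half-open while the position box is
		one-based, and the three rows are invented toy data, not real
		RefSeq records.

```python
import re
import sqlite3


def parse_position(text):
    """Split 'chrom:start-end' into (chrom, start, end), 1-based inclusive."""
    m = re.fullmatch(r"(\w+):(\d+)-(\d+)", text.strip())
    if m is None:
        raise ValueError(f"not a coordinate range: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


def overlapping_names(conn, position):
    """Names of records whose transcribed span overlaps the position."""
    chrom, start, end = parse_position(position)
    # Assumption: stored starts are 0-based, so shift the 1-based start down.
    rows = conn.execute(
        "SELECT name FROM refGene "
        "WHERE chrom = ? AND txStart < ? AND txEnd > ? ORDER BY txStart",
        (chrom, end, start - 1),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE refGene "
             "(name TEXT, chrom TEXT, txStart INTEGER, txEnd INTEGER)")
conn.executemany("INSERT INTO refGene VALUES (?, ?, ?, ?)", [
    ("toyA", "chr7", 26900000, 26907000),  # overlaps the left edge
    ("toyB", "chr7", 26950000, 26960000),  # lies entirely to the right
    ("toyC", "chr8", 26910000, 26920000),  # different chromosome
])

if __name__ == "__main__":
    print(overlapping_names(conn, "chr7:26906938-26940301"))
```

		Only the toy record that overlaps the range on chr7 is selected.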
	         
	         Batch query using identifiers
 
		In many cases, you may want to retrieve data based on a list of 
		one or more accessions or names, rather than querying by
		genomic position. Many tracks in the Table Browser, such as
		those in the Genes and Gene Prediction track group, 
		support identifier queries. The identifier type used in the
		query must match the kind of identifiers present in the track data, e.g.
		mRNA accession IDs must be used to query the mRNA table.
		 
		Follow these steps to display a list of 
		records that correspond to a set of accessions or names entered 
		as query input.
		 
		Step 1. Pick the genome assembly, track, and table
		 
		Step 2. Select the genome region setting
		 
		Step 3. Load the identifiers into the browser
		Click the Paste List 
		button to type or paste in the identifiers or the 
		Upload List button to load the data from a
		file existing on your local computer.
 
		If you are loading 
		multiple identifiers, entries must be separated by a space, tab,
		or line break. 
		Wildcards may not be used in the list (see the
		Filter section for information about
		conducting queries that include wildcards). 
		The Table Browser
		will retain the identifier list until you delete the
		information by clicking the Clear List button.
		 
		Step 4. Click the Get Output button
		See the Output Formats
		section for information about configuring the query output.
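
		If you prepare identifier lists with a script before pasting or
		uploading them, a small checker along these lines can catch
		problems early. It is a sketch of the rules stated above
		(whitespace-separated entries, no wildcards); dropping duplicate
		entries is an added convenience and not something this guide says
		the Table Browser does.

```python
def parse_identifiers(text):
    """Split pasted text into identifiers and reject wildcard entries."""
    ids = text.split()  # splits on spaces, tabs, and line breaks
    bad = [i for i in ids if "*" in i or "?" in i]
    if bad:
        raise ValueError(f"wildcards are not allowed in the list: {bad}")
    # Assumption: duplicates are dropped, keeping first-seen order.
    return list(dict.fromkeys(ids))


if __name__ == "__main__":
    pasted = "NM_005522\tNM_153620\nNM_006735 NM_005522"
    print(parse_identifiers(pasted))
```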

Filtering output by constraining field values

	      The Table Browser filter option can be used to: 
	      
	      apply constraints on table field values to restrict which records
	      should appear in the query output
	      
	      conduct batch queries using wildcards
	      
	      include fields from multiple tables in the query output
	       
	       Filtering on fields from a single table
 
	      Follow these steps to create a filter on one or more fields in a 
	      single table:
	       
	      Step 1. Select the assembly, track, and region
	       
	      Step 2. Click the Create button on the 
	      filter line
	       
	      Step 3. Add the filter constraints
	      One or more of the fields in the currently selected table may
	      be filtered by typing constraints into the corresponding text
	      boxes.
 
	      By default, the initial values set up in the filter match all 
	      records in the table. 
	      Constraints must match the data type of the field to be applied
	      successfully. For example, the geneName field in the hg17 
	      refFlat table is a string; therefore, constraining values
	      must also be strings. See the Filter
	      constraints section for more information on valid filter 
	      values.
	      Multiple filter values may be applied against one field by
	      separating the values with spaces.
	      Individual field constraints are combined with AND, 
	      i.e. a record must meet the constraints on all fields to be 
	      retrieved. 
	       
	      Step 4. Click the Submit button to apply the filter
 
	      Once a filter has been created on a table, it will persist for the 
	      duration of the Table Browser session or until it has been 
	      cleared. Only one filter can exist for a table at a time, but multiple filters may exist in one session if they are applied on different tables.
	      To modify an existing filter, click the Edit 
	      button on the filter line. To remove a filter,
	      click the Clear button.
	       
               Filtering on fields from multiple
	      tables
 
	      A Table Browser filter may include constraints on fields from
	      tables related to the primary table. To create a filter composed
	      of fields from multiple tables:
	       
	      Step 1. Select the assembly, track, and region
	       
	      Step 2. Click the Create button on the 
	      filter line

	      Note: If a filter already exists on the table, 
	      click the Edit 
	      button to modify it or the Clear button to
	      remove it.
 
	      Step 3. Select the tables to include in the filter
	      Scroll down to the Linked Tables
	      section of the page. The tables listed in this section are
	      linked to the selected table by one or more common fields (typically
	      a name, accession, or ID field). Click the boxes in front of the
	      table(s) whose fields you wish to include in the filter,
	      then click the Allow Filtering Using Field in Checked 
	      Tables button. 
	      The fields of the selected tables will be displayed in the top
	      portion of the page.
 
	      Step 4. Add the filter constraints
 
	      Step 5. Click the Submit button to apply the filter
 
	      Note: In the current implementation of the Table Browser,
	      the selected fields from primary and related tables output
	      format option must be used when including fields from multiple
	      tables in a filter. Check the boxes for all tables in the 
	      Linked Tables list on which filter constraints 
	      have been applied, then click the Allow Selection From
	      Checked Tables button to include them in the output.
	       
	       Filter constraints
 
	      Strings
	      Text fields are compared to words or patterns containing wildcard 
	      characters. Valid wildcards are "*" (matches 0 or more characters) 
	      and "?" (matches a single character). Each space-separated 
	      word or pattern in a text field box is matched against the 
	      value of that field in each record. If any word or pattern matches 
	      the value, then the record meets the constraint on that field.
 
	      Numbers
	      Numeric fields are compared to table data using an operator such 
	      as <, >, != (not equals) followed by a number. To specify a range, 
	      enter two numbers (start and end) separated by white space and/or 
	      a comma.
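
	      The matching rules for text and numeric constraints can be
	      sketched as follows. This is an illustration of the behavior
	      described above, not the Table Browser's implementation; it
	      assumes that text matching is case-sensitive and that a numeric
	      range includes both end points, neither of which is stated here.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '*' (0 or more characters) and '?' (one character)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches_text(value, constraint):
    """True if any space-separated word or pattern matches the value."""
    return any(wildcard_to_regex(p).fullmatch(value)
               for p in constraint.split())


def matches_range(value, constraint):
    """True if value lies in 'start end' or 'start, end' (assumed inclusive)."""
    start, end = (float(x) for x in re.split(r"[\s,]+", constraint.strip()))
    return start <= value <= end


def record_passes(record, text_constraints, range_constraints):
    """Constraints on different fields are combined with AND."""
    return (all(matches_text(record[f], c)
                for f, c in text_constraints.items())
            and all(matches_range(record[f], c)
                    for f, c in range_constraints.items()))


if __name__ == "__main__":
    rec = {"geneName": "HOXA1", "exonCount": 2}
    print(record_passes(rec, {"geneName": "HOXA? TP53"}, {"exonCount": "1, 3"}))
```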
 
	      Free-form queries
	      When the filters on individual fields aren't sufficiently flexible,
 	      the free-form query text box allows the application
	      of more complex constraints that typically relate two or more 
	      field names of the selected table. Valid free-form queries use 
	      the syntax of the SQL 
	      where clause (using
	      wildcards as defined above).
 
	      Free-form queries combine simple 
	      constraints with AND, OR, and NOT 
	      using parentheses as needed for clarity.  A simple constraint 
	      consists of a table field name, a comparison operator (see below),
	      and a value: a number, string, wildcard value (see below), or 
	      another field name. In place of a field name, you may use an 
	      arithmetic expression of numeric field names. 
	       
	       
	      
	      String or wildcard values for text comparisons must be quoted. 
	      Single or double quotes may be used. If comparing to a literal 
	      string value, use the "=" or "!=" operator. If comparing 
	      to a wildcard value, use the "LIKE" or "NOT LIKE" 
	      operator. 
	      
	      Numeric comparison operators include <, <=, =, != (not equals), 
	      >=, and >. 
	      Arithmetic operators include +, -, *, and /. 
	      Other SQL comparison keywords may also be used. 
	       
	      Example:
	      The following examples show free-form queries applied to the 
	      human refGene table.

	      
	      txStart = cdsStart  - 
	      searches for gene models 
	      missing expected 5' UTR upstream sequence (if strand is '+'; 3' UTR
	      downstream if strand is '-')
	      
	      chrom NOT LIKE "chr??"
	      - restricts search to chromosomes 1 - 9,  X and Y
	      
	      (cdsEnd - cdsStart) > 10000
	       - selects genes whose coding region spans more than 10 kbp
	      
	      (txStart != cdsStart) AND 
	      (txEnd != cdsEnd) AND exonCount = 1 -  
	      finds single-exon genes with both 3' and 5' flanking UTR
	      
	      ((cdsEnd - cdsStart) > 30000) AND 
	      (exonCount=2 OR exonCount=3) - finds genes with 
	      long spans but only 2 - 3 exons
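
	      To see how such expressions select rows, the sketch below runs
	      several of them as SQL where clauses against an in-memory
	      SQLite table. SQLite stands in for the real database here, the
	      three rows are invented toy records, and the LIKE example is
	      left out because SQLite's LIKE operator uses different wildcard
	      characters from the ones described in this guide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE refGene (name TEXT, chrom TEXT, strand TEXT, "
    "txStart INTEGER, txEnd INTEGER, cdsStart INTEGER, cdsEnd INTEGER, "
    "exonCount INTEGER)"
)
conn.executemany("INSERT INTO refGene VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    # name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount
    ("toy1", "chr1", "+", 1000, 50000, 1000, 45000, 2),  # no UTR at start
    ("toy2", "chr10", "-", 2000, 4000, 2100, 3900, 1),   # single exon, 2 UTRs
    ("toy3", "chrX", "+", 500, 9000, 600, 9000, 5),      # no UTR at end
])


def select(where):
    """Return the names of rows that satisfy a free-form where clause."""
    sql = f"SELECT name FROM refGene WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    for clause in [
        "txStart = cdsStart",
        "(cdsEnd - cdsStart) > 10000",
        "(txStart != cdsStart) AND (txEnd != cdsEnd) AND exonCount = 1",
        "((cdsEnd - cdsStart) > 30000) AND (exonCount=2 OR exonCount=3)",
    ]:
        print(clause, "->", select(clause))
```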

Intersecting data from multiple tables

	      It is often interesting to compare the positions of features
	      in different annotation tracks to identify points of overlap. The 
	      Table Browser intersection 
	      utility can be used to generate various position-based 
	      comparisons of track features. Using the 
	      intersection utility, you can: 
	      
	      examine all genomic positions where the feature data from the two 
	      tracks overlap
	      
	      identify genomic locations where there is no overlap between 
	      track features
	      
	      establish thresholds for the amount of overlap that must exist 
	      between the two feature sets
	      
	      conduct feature-by-feature comparisons as well as
	      base-by-base comparisons of tracks
	      
	      complement (invert) a position set before comparing the tracks 
	       
	      An 
	      intersection may be expanded to include additional tables by using 
	      the Table Browser custom track feature.
	       
	      Note: The intersection utility can be used 
	      only on positional tables. To 
	      generate intersections incorporating data in non-positional tables, 
	      use the Table Browser filter utility. See the 
	      Filtering on fields from multiple 
	      tables section for more information.
	       
	       Intersecting data from two tables
 
	      Follow these steps to configure and generate an intersection between
	      two positional tables:
	       
	      Step 1. Select the assembly, track, table, and region for the 
	      primary table

	      Note: Only positional tables may be used in an intersection.
 
	      Step 2. Click the Create button on the 
	      intersection line

	      Note: If an intersection already exists on the table, 
	      click the Edit 
	      button to modify it or the Clear button to
	      remove it.
 
	      Step 3. Select the secondary track to include in the intersection
	      Select a group in the group list, then select a 
	      track from the track list. To view all the
	      tracks available, regardless of group, select the 
	      All Tracks option in the group list.
 
	      Step 4. Select a combination method
	      The Table Browser provides two major types of comparisons:

	      feature-by-feature comparisons preserve the structure of 
	      the primary table. For example, if the primary 
	      table describes exon structure and the features are compared
	      with a second table, the results will describe exon structure 
	      (unless you choose an output format in which the structure is 
	      lost). 
	      
	      base-by-base comparisons examine 
	      the primary table and the table underlying the secondary track one 
	      base at a time. The structure 
	      of the primary table is not preserved in this comparison. For 
	      example, even if the primary table describes 
	      exon structure, the intersection results will contain only position 
	      ranges; no information about exon/block structure, strand, or 
	      translation region will be retained. 

	      Click the circle in front of a combination method to select it. 
	      Only one method may be selected from the two sets of methods. For
	      more information about the individual combination options, see the 
	      Intersection Options section.
	       
	      Step 5. (optional) Select the complement options
	      Check the box in front of one or both tables to complement the
	      feature data in that table. The complement options allow you to 
	      invert the set of positions
	      covered by one or both tables. For example, if you choose to 
	      complement the primary track, any position covered by that
	      track's features will be considered not covered, and vice 
	      versa. This option provides more flexibility in comparing track 
	      positions.
 
	      Step 6. Click the Submit button to apply the 
	      intersection
	      Once an intersection has been created on a table, it will persist 
	      for the duration of the Table Browser session or until it has been 
	      cleared. Only one intersection may exist at a time. 
	      To modify an existing intersection, click the Edit 
	      button on the intersection line. To remove an 
	      intersection, click the Clear button.
 
	       Intersecting data from more than two tables
 
	      The Table Browser intersection utility limits
	      combinations to only two tables. An existing intersection may be
	      expanded to include additional tables by using the Table Browser
	      custom track utility. To create an intersection on multiple tables:
	       
	      Step 1. Set up an intersection between two tables
	      See the Intersecting data 
	      from two tables section for more information.
 
	      Step 2. Save the intersection data in a custom track
	      See the Saving data as a custom track
	      section for information on generating a custom track. 
	      Note: In the 
	      current implementation of the Table Browser, you must use the
	      Get Custom Track button on the custom track page
	      to add the custom track to the Table Browser track
	      list.
 
	      Step 3. Select the newly-generated custom track
	      Select the Custom Tracks option in the 
	      group list, then select the newly-created custom
	      track from the track list.
 
	      Step 4. Create an intersection with another track
	      Follow the steps in the 
	      Intersecting data from two tables 
	      section to intersect the custom track with another track.
 
	       Intersection options
 
	      Feature-by-feature comparisons
	      Some comparisons preserve the primary table's gene and alignment 
	      structure, if it exists. For example, if the refGene
	      table (human RefSeq Genes track) is combined with another
	      table  using one of these comparisons, the resulting output data
	      will describe exon structure (unless you choose an output format 
	      in which the structure is lost). Primary table features are kept 
	      or discarded based on the amount of positional overlap with 
	      the features in the table underlying the secondary track. The 
	      Table Browser offers the following options in this category:
 
	      
	      Any overlap: A primary table record will appear in the 
	      output if any of its base positions are covered by any 
	      feature in the secondary table. 
	      
	      No overlap: A primary table record will appear in the 
	      output only if none of its base positions are covered by any 
	      feature in the secondary table. 
	      
	      Overlap greater than a specified threshold: A primary table 
	      record will appear in the output if the percentage of its 
	      base positions covered by secondary table features is greater than 
	      the user-specified threshold. 
	      
	      
	      Overlap less than a specified threshold: A primary table record 
	      will appear in the output if the percentage of its base positions 
	      covered by secondary table features is less than the user-specified
	      threshold. 
	       Note: If the primary table has an exon/block structure, 
	      only those bases located in exons and/or blocks will be counted. 
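
	      The four feature-by-feature options can be read as tests on a
	      single number: the percentage of a primary record's bases that
	      secondary features cover. The sketch below is one possible way
	      to compute it, assuming half-open (start, end) intervals and a
	      primary record whose own blocks do not overlap each other. The
	      function and mode names are hypothetical.

```python
def covered_bases(blocks, features):
    """Count bases in `blocks` covered by at least one secondary feature."""
    covered = 0
    for b_start, b_end in blocks:
        # Clip each overlapping feature to the block, then sweep left to
        # right so that overlapping features are not counted twice.
        clipped = sorted((max(s, b_start), min(e, b_end))
                         for s, e in features if s < b_end and e > b_start)
        pos = b_start
        for s, e in clipped:
            if e > pos:
                covered += e - max(s, pos)
                pos = e
    return covered


def overlap_percent(blocks, features):
    """Percentage of exon/block bases covered by secondary features."""
    total = sum(e - s for s, e in blocks)
    return 100.0 * covered_bases(blocks, features) / total if total else 0.0


def keep_record(blocks, features, mode, threshold=0.0):
    """Apply one of the four feature-by-feature options."""
    pct = overlap_percent(blocks, features)
    if mode == "any":
        return pct > 0
    if mode == "none":
        return pct == 0
    if mode == "more":
        return pct > threshold
    if mode == "less":
        return pct < threshold
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    exons = [(0, 10), (90, 100)]
    # A feature lying entirely in the intron covers no exon bases.
    print(keep_record(exons, [(10, 90)], "none"))
```

	      Because only exon or block bases are counted, a secondary
	      feature that falls entirely within an intron contributes nothing
	      to the percentage.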
	       
	      Base-by-base comparisons
	      In these combination options, the positions of the primary 
	      and secondary table features are compared one base position at a 
	      time. When applying base-by-base comparisons, the structure of the 
	      primary table is not preserved. For example, if the refGene
	      table (from the human RefSeq Genes track) is compared with
	      a secondary table using these comparisons, the resulting output
	      data will not describe exon structure. Instead, only position 
	      ranges will be returned; the exon/block structure, strand, and
	      translation region information will be discarded. The Table Browser
	      provides the following base-by-base combination options:
 
	      
	      Base-by-base intersection (AND): A nucleotide position 
	      is included in the output if it is covered by at least one feature 
	      of both the primary table and the secondary table.
	      
	      Base-by-base union (OR): A nucleotide position is 
	      included in the output if it is covered by at least one feature of 
	      either the primary table or the secondary table. 
	       
	       Note: If the primary table has an exon/block structure, 
	      only base positions located in exons and/or blocks will be counted. 
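
	      Base-by-base intersection and union amount to set operations on
	      position ranges. A minimal sketch, assuming half-open
	      (start, end) intervals on a single chromosome; the function
	      names are illustrative only.

```python
def merge(intervals):
    """Merge overlapping or touching intervals into a sorted list."""
    merged = []
    for s, e in sorted(intervals):
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    return [tuple(x) for x in merged]


def union(a, b):
    """Positions covered by a feature of either table (OR)."""
    return merge(list(a) + list(b))


def intersection(a, b):
    """Positions covered by a feature of both tables (AND)."""
    a, b = merge(a), merge(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        s, e = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if s < e:
            out.append((s, e))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    primary = [(0, 10), (20, 30)]
    secondary = [(5, 25)]
    print(intersection(primary, secondary))
    print(union(primary, secondary))
```

	      The results are plain position ranges, which is why exon
	      structure, strand, and translation region cannot survive this
	      kind of comparison.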
	       
	      Base-by-base complement (NOT)
	      Before the Table Browser applies a feature-by-feature or 
	      base-by-base comparison to the table data, the set of positions 
	      covered by one or both tables can be inverted (complemented). When 
	      the data set of a table is complemented, any position 
	      covered by the table's features in the original data will be 
	      considered not covered in the inverted data, and vice versa. 
	      This option gives the user more flexibility in comparing table 
	      positions.
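
	      Complementing a position set can be sketched as collecting the
	      gaps between features. The example assumes half-open
	      (start, end) intervals and takes the complement relative to a
	      single sequence of known length; chrom_size is a hypothetical
	      parameter, and this guide does not say what extent the Table
	      Browser inverts against.

```python
def complement(intervals, chrom_size):
    """Return the ranges of [0, chrom_size) not covered by any interval."""
    out, pos = [], 0
    for s, e in sorted(intervals):
        if s > pos:
            out.append((pos, s))   # gap before this feature
        pos = max(pos, e)
    if pos < chrom_size:
        out.append((pos, chrom_size))  # gap after the last feature
    return out


if __name__ == "__main__":
    features = [(10, 20), (15, 40)]
    inverted = complement(features, 100)
    print(inverted)
    # Complementing twice gives back the merged original coverage.
    print(complement(inverted, 100))
```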

Output formats

	      The data resulting from a Table Browser query may be configured in
	      a number of different ways: 
	       
	      The output can be displayed on the
	      screen, saved to a file, or saved to an annotation track table 
	      that can be displayed in the Genome Browser or used in a subsequent
	      Table Browser query.
	      
	      The data can include all fields from the primary or selected table, 
	      or can be restricted to selected fields from the
	      primary table and related tables.
	      
	      The data can be organized in one of several formats: tab-separated, 
	      sequence (FASTA), Browser Extensible Data format (BED), 
	      Gene Transfer Format (GTF), or a statistical summary of the data
	      in the query.
	       
	      The output options available for a specific query may vary
	      depending on the table(s) selected. For example, non-positional
	      table data cannot be organized in a position-based format, but 
	      instead may be displayed only in tab-separated format. The Table 
	      Browser will automatically update the options on the 
	      output format list to show only those available
	      for the current query.
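
	      As an illustration of a position-based format, the sketch below
	      writes one gene record as a six-column BED line (chrom, start,
	      end, name, score, strand). The record's field names follow the
	      refGene examples earlier in this guide; the score of 0 is a
	      placeholder assumption, and the Table Browser's real BED output
	      may carry more columns.

```python
def to_bed6(record):
    """Format a gene record as a tab-separated six-column BED line."""
    fields = [
        record["chrom"],
        record["txStart"],
        record["txEnd"],
        record["name"],
        0,                 # placeholder score (assumption)
        record["strand"],
    ]
    return "\t".join(str(f) for f in fields)


if __name__ == "__main__":
    toy = {"name": "toy1", "chrom": "chr7", "strand": "+",
           "txStart": 100, "txEnd": 500}
    print(to_bed6(toy))
```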
	       
	       Displaying all fields in a table
 
	      To display all the fields of the records in the query output
	      in tab-separated format, select the all fields from primary
	      table option.
	       
	       Displaying selected fields from one or more tables
 
	      To restrict the query output to a subset of the fields in a table, 
	      choose
	      the selected fields from primary and related tables option.
	      You will be prompted to pick the table fields to display. Click the 
	      box in front of the fields you would like to see in the query 
	      output (or
	      click the Check All button to select all the 
	      fields), then click the Get Fields button.
	       
	      To include data fields from other tables linked to the selected
	      table, choose the selected fields from primary and related 
	      tables option, then scroll down to the Linked Tables
	      section of the page. The tables listed in this section are
	      linked to the selected table by one or more common fields (typically
	      a name, accession, or ID field). Click the boxes in front of the
	      table(s) whose fields you wish to include in the query output,
	      then click the Allow Selection From Checked Tables button. 
	      The fields of the selected tables will be displayed in the top
	      portion of the page. Click the boxes in front of the fields
	      that you wish to include in the query output, then click the 
	      Get Fields button underneath any of the field
	      lists to generate tab-separated output that includes data from
	      all the selected fields. Note that the Get Fields
	      and Cancel buttons apply globally to all the 
	      selected tables, but the Check All and 
	      Clear All buttons apply only to the fields listed
	      directly above the buttons.
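
	      Conceptually, combining selected fields from a primary table
	      and a linked table is a join on their common field. The sketch
	      below shows the idea with plain dictionaries. The field names
	      mrnaAcc and product, the header row, and the "n/a" marker for
	      records with no linked row are all illustrative assumptions
	      rather than documented Table Browser behavior.

```python
def select_fields(primary_rows, linked_rows, primary_key, linked_key,
                  primary_fields, linked_fields):
    """Join two tables on a common field and emit tab-separated text."""
    by_key = {}
    for row in linked_rows:
        by_key.setdefault(row[linked_key], []).append(row)

    lines = ["\t".join(primary_fields + linked_fields)]  # header row
    for p in primary_rows:
        # A primary record with no linked row still appears once.
        for linked in by_key.get(p[primary_key], [{}]):
            values = ([p[f] for f in primary_fields]
                      + [linked.get(f, "n/a") for f in linked_fields])
            lines.append("\t".join(str(v) for v in values))
    return "\n".join(lines)


if __name__ == "__main__":
    genes = [{"name": "NM_1", "chrom": "chr1"},
             {"name": "NM_2", "chrom": "chr2"}]
    links = [{"mrnaAcc": "NM_1", "product": "toy protein"}]
    print(select_fields(genes, links, "name", "mrnaAcc",
                        ["name", "chrom"], ["product"]))
```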
	       
	       Displaying sequence (FASTA) data (positional tables only)
 
	      To display the genomic sequence underlying the query results, 
	      select the sequence option in the output 
	      format list. The Table Browser will present you with 
	      several options to configure the output display. When you have
	      completed the configuration, click the Get Sequence
	      button.
	      When displaying sequence data for gene prediction tracks, you
	      will also be offered the option to view the protein and mRNA sequence 
	      as extracted from the data source in addition to the genomic sequence.
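
	      The core of sequence output is slicing a range out of the
	      chromosome sequence and writing it as a FASTA record. The
	      sketch below assumes zero-based, half-open coordinates, reverse
	      complements features on the minus strand, and wraps lines at 50
	      characters. These are illustrative choices, and the header
	      format is not the one the Table Browser produces.

```python
_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fasta_record(name, chrom_seq, start, end, strand="+", width=50):
    """Return a FASTA record for chrom_seq[start:end] on the given strand."""
    seq = chrom_seq[start:end]
    if strand == "-":
        seq = reverse_complement(seq)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([f">{name}"] + lines)


if __name__ == "__main__":
    toy_chrom = "AAACCCGGGTTT"  # invented sequence, not real genome data
    print(fasta_record("toy_plus", toy_chrom, 0, 4))
    print(fasta_record("toy_minus", toy_chrom, 0, 4, strand="-"))
```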
	       
	   Displaying CDS FASTA alignments (genePred tables only)
 
          The CDS FASTA alignments are created from a Multiple Alignment File
          (MAF) in combination with a genePred table.  
          The UCSC MAF format stores 
          multiple alignments at the DNA level between entire genomes. You can
          use the Table Browser to return FASTA alignments of coding regions in
          nucleotide-space or translated into
          amino acid-space.  However, it is worth noting that the initial
          MAF files are all created by aligning genomes at the DNA level.
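
          The step from nucleotide-space to amino acid-space can be
          sketched with the standard genetic code. This is a simplified
          illustration that assumes the coding sequence is already in
          frame on the coding strand; rendering an all-gap codon as "-"
          and an unrecognized codon as "X" are assumptions, not documented
          Table Browser behavior.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame coding sequence, one codon at a time."""
    cds = cds.upper()
    protein = []
    # Ignore a trailing partial codon, if any.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon == "---":
            protein.append("-")      # alignment gap (assumed rendering)
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
    print(translate("ATG---GGN"))
```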
           
	  Genome-wide CDS FASTA alignments
          Note that when using
          the Table Browser to fetch CDS FASTA output, it is best to 
          restrict your query to a reasonable-sized position range rather than 
          requesting output from the entire genome.  A genome-wide
          query will take a substantial amount of compute time, and it is
          likely that your Internet browser will time out and disconnect.
          If you would like to download genome-wide CDS FASTA output 
          for any of several model organisms, you can do so from the 
          download server.
 
          Creating CDS FASTA alignments using the Table Browser
          To display FASTA multiple alignments for the CDS regions of genes,
          select the CDS FASTA alignment from multiple alignment 
          option in the output format list. In order to see
          this output format option, you must have a genePred table 
          selected. 
          If you limit your search to a certain position range within the
          genome (rather than searching the entire genome), the tool
          will return FASTA alignments for all genes that overlap
          the position for which you are searching. 
          The Table Browser will present you with a configuration page.
          On this page, you can select options for your ouput.
 
          First, select your MAF table.  This is the table from which 
          the multiple alignments will be extracted for the CDS regions
          of your gene track. If you do not know the name of the MAF table
          that corresponds to the Conservation track, you can find it in the
          Genome Browser by following these 
          instructions. 
           
          Then select any of the following choices:
	     
            Separate into exons -
	    The default behavior is for the coding exons of each gene to be
            concatenated into
	    one sequence in the output FASTA multiple alignment. In this
	    case each output row header has the format listed
	    below under "Whole gene format".
	    If the separate into exons option is chosen then each exon will be
	    listed with a separate header in the format
	    listed below under "Exon format".
            Show nucleotides -
	    The default behavior is for the nucleotides in the alignment
            to be translated
	    into amino acids according to the strand and exon frames
	    defined in the selected genePred table.
            If this option is chosen, then the nucleotides
	    in the alignment will not be translated into amino
	    acids.        
            Output lines with just dashes - 
	    The default behavior is for the alignment rows that contain
            only dashes to not be printed.  If this option is chosen, then
	    these dashes-only rows are printed.
            Format output as table - 
            If this option is chosen, the header and sequence for each organism
            will appear on the same line.
              
              Truncate headers as __ characters (enter zero for
              no headers) -
              This option works in conjunction with the "Format output
              as table" option. 
              If you want to see only a portion of the headers, choose this
              option, and enter the number of characters at which you would 
              like the headers truncated.
               
	    Finally, from the list of species, select those that you would
         like included in the FASTA multiple alignment output.  Press
         the "get output" button to view the output.
               
            Explanation of CDS FASTA header format
 
            - Whole gene format:
            geneName_assemblyName peptideLength location
            - Exon format:
	    geneName_assemblyName_exonNum_totalExons exonLength inFrame outFrame location
 
 Following are the descriptions for each field name:
 
	      geneName- the name field from the genePred table.
	      assemblyName- the UCSC assembly 
              name
              for the species.
	      peptideLength- the length of the entire coding region.
	      If the "Show nucleotides" option is chosen, this will 
              be in nucleotides, otherwise it will be the number of amino 
              acids in the peptide.
	      location- the chromosome position within the 
	      assembly that is aligned in the multiple alignment.  The format
	      of this string is chrom:start-end followed by the strand
	      (+ or -) on which the alignment occurs.  If more than one region is
	      aligned, all the regions are listed with a semicolon (;)
	      between each position.  This address is in Genome Browser
	      coordinates (i.e. the start address is 
              one-based).
	      exonNum- the ordinal of the exon.  Exons are
	      numbered from one, beginning at the transcription start site
	      and progressing along the strand of transcription.
	      totalExons- the number of coding exons in the gene.
	      exonLength- the length of the current exon.
	      If the "Show nucleotides" option is chosen, this will 
              be the 
	      number of nucleotides in the exon, otherwise it will be 
	      the number of amino acids in 
	      the exon (with amino acids translated from split codons 
	      placed in the exon where two of the three nucleotides lie).
	      inFrame- the frame number of the first nucleotide
	      in the exon. Frame numbers can be 0, 1, or 2 depending on
	      what position that nucleotide takes in the codon which
	      contains it.
	      outFrame- the frame number of the nucleotide
	      after the last nucleotide in this exon.
	      Frame numbers can be 0, 1, or 2 depending on
	      what position that nucleotide takes in the codon which
	      contains it.
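
          The header layout described above can be split mechanically.
          The following is a minimal Python sketch, not part of the Table
          Browser itself: the function and class names are hypothetical,
          and it assumes that assembly names contain no underscore and
          that the location field may be absent on rows with no aligned
          sequence.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int   # one-based, exactly as printed in the header
    end: int
    strand: str  # "+" or "-"


_REGION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)(?P<strand>[+-])$")


def parse_location(text):
    """Split 'chrom:start-end<strand>' entries separated by semicolons."""
    regions = []
    for part in text.split(";"):
        if not part:
            continue  # empty location, or a trailing semicolon
        m = _REGION.match(part)
        if m is None:
            raise ValueError(f"unrecognized location: {part!r}")
        regions.append(Region(m["chrom"], int(m["start"]), int(m["end"]), m["strand"]))
    return regions


def parse_gene_header(line):
    """Parse a 'Whole gene format' header line."""
    fields = line.lstrip(">").split()
    # Split from the right so gene names containing underscores survive.
    gene, assembly = fields[0].rsplit("_", 1)
    return {
        "geneName": gene,
        "assemblyName": assembly,
        "peptideLength": int(fields[1]),
        "location": parse_location("".join(fields[2:])),
    }


def parse_exon_header(line):
    """Parse an 'Exon format' header line."""
    fields = line.lstrip(">").split()
    gene, assembly, exon_num, total = fields[0].rsplit("_", 3)
    return {
        "geneName": gene,
        "assemblyName": assembly,
        "exonNum": int(exon_num),
        "totalExons": int(total),
        "exonLength": int(fields[1]),
        "inFrame": int(fields[2]),
        "outFrame": int(fields[3]),
        "location": parse_location("".join(fields[4:])),
    }
```

          Splitting the first token from the right keeps gene names such
          as RefSeq accessions, which themselves contain an underscore,
          intact.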
	       
         Explanation of CDS FASTA sequence format
 
         As noted above, the CDS FASTA output files can be in either DNA-space 
         or protein-space.
 
         In some instances, there is a dash ("-") in the 
         sequence portion of the CDS FASTA file. Dashes are used in several 
         circumstances.  They indicate missing sequence for the aligning 
         genome, as well as deletions in the aligning genome or insertions
         in the base genome.
          
         Because the CDS FASTA alignments are based on one reference
         genome, any amino acids or nucleotides that are not
         in the reference genome are not displayed.  Consequently the peptides
         shown for aligning genomes are not necessarily the peptide that
         the other organisms' gene would generate.  Any sequence inserted
         in an aligning genome or deleted in the base genome will not be 
         present in the alignment.  We represent this condition
         with an orange bar in the Genome Browser display, but
         the CDS FASTA alignments silently ignore this issue. 
          
         - Nucleotide CDS FASTA sequence:
 
         Consider the example below that shows the FASTA sequence for four
         species aligned with the first exon of the human gene
         PLEKHO1 (UCSC Gene: uc001ett.1). Note that the rat (rn4) row 
         is missing 
         the first three nucleotides.  This could be due to a 
         lineage-specific insertion between the rat and  
         human genomes, or a lineage-specific deletion between the
         human and rat genomes. Note also that the zebrafish 
         (danRer4) row contains only dashes.  This could be due to 
         excessive evolutionary distance between the zebrafish and human, 
         missing data in the zebrafish, or independent indels in the region in
         both species. Sometimes it is helpful to view the  
         Conservation track in the Genome Browser in this area to
         clarify the exact meaning of the dashes.
 
>uc001ett.1_hg18_1_6 30 0 0 chr1:148389072-148389101+
ATGATGAAGAAGAACAATTCCGCCAAGCGG
>uc001ett.1_panTro2_1_6 30 0 0 chr1:129156502-129156531+
ATGATGAAGAAGAACAATTCCGCCAAGCGG
>uc001ett.1_rn4_1_6 30 0 0 chr2:190795892-190795918-
---ATGAAGAAGAGCGGCTCCGGCAAGCGG
>uc001ett.1_danRer4_1_6 30 0 0
------------------------------
>uc001ett.1_oryLat2_1_6 30 0 0 chr11:3404940-3404969-
AGGATGAAGAAAAGCAACCAGAGCAGGCGG
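
         Output like the example above can be screened for dashes with a
         few lines of Python. This is only a sketch with hypothetical
         function names; it assumes the output has been saved as plain
         text with one header line (starting with ">") per row.

```python
def read_cds_fasta(text):
    """Return a list of (header, sequence) pairs from CDS FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def is_dashes_only(sequence):
    """True for a non-empty row made up entirely of dashes."""
    return bool(sequence) and set(sequence) == {"-"}


def leading_dashes(sequence):
    """Number of dash characters at the start of the row."""
    return len(sequence) - len(sequence.lstrip("-"))
```

         Applied to the example, is_dashes_only picks out the danRer4 row
         and leading_dashes returns 3 for the rn4 row. The functions do not
         explain why a dash is present; for that, the Conservation track
         remains the place to look.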
- Amino Acid CDS FASTA sequence: 
 
          Codons that have a dash in any of the three nucleotides are 
          represented by a dash in the amino acid.  
          Codons with an N in any position are represented with an X.  
          Stop codons are represented with a Z.  
          All other amino acids follow the IUPAC amino acid codes.
          In exon format, when the codon triplet is split between two 
          exons, the amino acid will be displayed as part of the exon
          containing two of the three nucleotides like so:
          
            |exon1|  |exon2|
nucleotide: AAACCCT  TTGGGAAA
protein:     K  P    F  G  K 
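
          The translation rules above can be illustrated with a short
          Python sketch. It is not the Table Browser's own code: the
          names are hypothetical, the standard genetic code is assumed,
          the CDS is assumed to start in frame 0, a codon containing
          both a dash and an N is treated as a dash, and any other
          unrecognized codon is shown as X.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_codon(codon):
    """Translate one codon using the dash, X, and Z conventions."""
    codon = codon.upper()
    if "-" in codon:
        return "-"          # any dash in the codon gives a dash
    if "N" in codon:
        return "X"          # any N in the codon gives X
    amino = CODON_TABLE.get(codon, "X")  # assumption: other symbols give X
    return "Z" if amino == "*" else amino  # stop codons are shown as Z


def translate_exons(exons):
    """Translate coding exons, one peptide string per exon.

    A codon split between two exons goes to the exon holding two of
    its three nucleotides, which is always the exon that contains the
    codon's middle nucleotide.
    """
    cds = "".join(exons)
    # owner[i] is the index of the exon that contains nucleotide i.
    owner = [idx for idx, exon in enumerate(exons) for _ in exon]
    peptides = [[] for _ in exons]
    # A trailing partial codon, if any, is ignored.
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        peptides[owner[pos + 1]].append(translate_codon(cds[pos:pos + 3]))
    return ["".join(p) for p in peptides]
```

          For the diagram above, translate_exons(["AAACCCT", "TTGGGAAA"])
          places K and P in the first exon and F, G, and K in the second,
          because the split codon TTT has two of its nucleotides in the
          second exon.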
Saving query results in GTF or BED format (positional tables only)
 
	      To format the query results using 
	      GTF or 
	      BED conventions,
	      select the corresponding option in the output 
	      format list. Note that when you select GTF, the Table Browser
	      translates the output into this format. For tables that lack feature
	      designations, all records are arbitrarily assigned the feature "exon" to
	      conform to GTF specifications.  
	      If you select BED format, you will be
	      presented with the option to include and configure a custom track
	      header and options for organizing the data. When you have finished
	      the configuration -- or to accept the default options -- click
	      the Get BED button at the bottom of the window.
	        
	       Saving data to a file
 
	      By default, the Table Browser displays query results directly in
	      your internet browser window. To redirect the data to a file,
	      type a file name into the output file box before
	      starting the query. The Table Browser will prompt you for the
	      location of this file on your local disk while processing the
	      query.
	       
	       Saving data as a custom track (positional tables only)
 
	      Query output may be saved in a format that can be displayed as
	      a custom annotation track in the Genome Browser. Custom tracks
	      created during a Table Browser session may also be used for 
	      subsequent queries and intersections in the same session. For 
	      more information on custom tracks, see the Genome Browser
	      User's Guide.
	       
	      To save query data in custom track format, select the custom
  	      track option in the output format list. When
	      the query is executed, the Table Browser will prompt you to
	      customize the track header and configure the record layout of
	      the data. The configuration is optional; the Table Browser
	      automatically sets up a default track configuration. Click the 
	      Custom track link for more information on custom track 
	      syntax and format.
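
	      For orientation, a custom track file is plain text: a track
	      line followed by data lines. The Python sketch below writes
	      a minimal BED-style track. It is an illustration only: the
	      function name is hypothetical, only the name and
	      description attributes are set, and the header that the
	      Table Browser generates by default may carry additional
	      attributes. Input positions are assumed to use one-based
	      starts, so the sketch subtracts one to obtain the zero-based
	      start that BED uses.

```python
def make_custom_track(name, description, records):
    """Build custom track text from (chrom, start, end, item) tuples.

    start is assumed to be one-based and end inclusive; BED lines use
    a zero-based start and an exclusive end, so only start changes.
    """
    lines = [f'track name="{name}" description="{description}"']
    for chrom, start, end, item in records:
        lines.append(f"{chrom}\t{start - 1}\t{end}\t{item}")
    return "\n".join(lines) + "\n"
```

	      For example, the record ("chr1", 148389072, 148389101,
	      "uc001ett.1") becomes a BED line that starts at 148389071 and
	      ends at 148389101.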
	       
	      When you have finished configuring the custom track -- or to accept
	      the default configuration -- click one of the buttons at the bottom
	      of the window to create the custom annotation track. 
	       
	      
	      To display the query results as text on the screen, click the 
	      Get Custom Track File button. 
	      
	      To save the query results to a file on your local disk for future 
	      use, specify a file name in the output file box
	      before executing the query, then click the Get Custom Track 
	      File button. 
	      
	      To load the query results into a table accessible from the 
	      Table Browser table list, click the 
	      Get Custom Track in Table Browser button. 
	      
	      To view the query results as a custom track in the Genome 
	      Browser, click the Get Custom Track in Genome 
	      Browser button. Your 
	      browser display will be redirected automatically to the Genome 
	      Browser, with your custom track positioned near the top of the 
	      annotation tracks window. 
	      
	      To access your custom track data in a subsequent query in the same
	      Table Browser session, select the Custom Tracks option
	      from the group list to display the custom tracks
	      available. 
	       
	       Displaying query results as Genome Browser hyperlinks (positional tables only)
 
	      To examine the records in the query output individually in the
	      Genome Browser, select the hyperlinks to Genome Browser
	      output option. The Table Browser will display a list of one or
	      more hyperlinks corresponding to the individual records in the 
	      output data. Click a link to open up the Genome Browser display
	      to the item and position shown on the hyperlink.
	       
	       Displaying a statistical summary of query data (positional tables only)
 
	      To generate a statistical summary of the query output data, the
	      region covered by the query, and the CPU time required to process 
	      the query, click the Summary/Statistics button.
	       |  | 
 
 |  |  
   |