GenBank/RefSeq Alignment Step
    This step is done for each browser database (species and assembly) that is
    being updated. 
    Algorithm
    
      - Select the most current GenBank (and corresponding RefSeq) full
      releases from the processed/ directory.
- 
        If there is not a corresponding aligned/.../full directory for the release:
          - If there is a previous aligned release for this database, find
          the aligned sequences that have not changed between that release and
          the new full release. Save these in a temporary directory.
- Copy new and changed sequences to temporary fasta files for
          alignment.
 
- 
        For each processed/.../daily.${ver}/ directory that does not have a completed alignment/ directory:
          - Copy new and changed sequences to temporary fasta files for
          alignment.
 
- If the number of sequences requiring alignment exceeds a
      configured threshold, send e-mail requesting an alignment on the big
      cluster and stop the automated process. Normally, this should only be
      required when a new database is built.
- If the number of sequences is below the threshold, run BLAT on the
      mini-cluster, using the parasol make facility.
- 
        Process the completed alignments: 
        
          - Combine in any pending alignments migrated from the previous
          release.
- Build an index file.
- Checksum the files.
- Run sanity checks on the alignments: verify that changed
          sequences continue to align, at least in most cases, and that the
          number of aligned sequences increases.
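
The selection and threshold steps above can be sketched roughly as follows. This is only an illustration, not the pipeline's actual code: the function names, the accession-to-version dictionaries, and the use of the version number to detect a changed sequence are all assumptions. The text also does not say what happens when the count equals the threshold; the sketch arbitrarily sends that case to the mini-cluster.

```python
from typing import Dict, Set, Tuple


def partition_sequences(
    prev_aligned: Dict[str, int], new_release: Dict[str, int]
) -> Tuple[Set[str], Set[str]]:
    """Split accessions in the new release into (migrate, align).

    Both arguments map accession -> version (hypothetical representation).
    An accession whose version matches the previous aligned release is
    treated as unchanged, so its alignments can be migrated; anything new
    or with a different version must be aligned again.
    """
    migrate: Set[str] = set()
    align: Set[str] = set()
    for acc, version in new_release.items():
        if prev_aligned.get(acc) == version:
            migrate.add(acc)
        else:
            align.add(acc)
    return migrate, align


def choose_action(num_to_align: int, threshold: int) -> str:
    """Decide where the alignment should run (labels are hypothetical)."""
    if num_to_align > threshold:
        # Too many sequences: request a big-cluster run and stop.
        return "request-big-cluster"
    # Otherwise run BLAT on the mini-cluster.
    return "run-mini-cluster"
```

For example, given a previous release containing AB000001.1 and AB000002.1 and a new release containing AB000001.1, AB000002.2, and AB000003.1, only AB000001 would be migrated; the other two would be written out for alignment.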
 
      - 
        $gbRoot/data/aligned/- aligned files
          - 
            genbank.${ver}/
              - 
                ${db}/- alignments for this genome
                database (e.g. hg12).
                  - 
                    full/- Alignments corresponding to the full
                    release. This is a combination of alignments migrated from
                    previous releases and new alignments.
                      - mrna.native.psl.gz, mrna.native.oi.gz, mrna.native.alidx, mrna.native.md5
                      - est.aa.native.psl.gz, est.aa.native.oi.gz, est.aa.native.alidx, est.aa.native.md5
                      - mrna.xeno.psl.gz, mrna.xeno.oi.gz, mrna.xeno.alidx, mrna.xeno.md5
                      - est.aa.xeno.psl.gz, est.aa.xeno.oi.gz, est.aa.xeno.alidx, est.aa.xeno.md5
 
- daily.${date}/- Alignments for
                  sequences that were new or modified in the daily update.
 
 
- 
            refseq.${ver}/- BLAT alignments for RefSeq,
            same structure as used for GenBank, with only native mRNAs.
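
The file naming in the full/ directory listed above follows a regular pattern (prefix, native or xeno, extension), which a small helper could exploit to check that a directory is complete. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the prefixes must be supplied by the caller (for instance "mrna" and "est.aa", as in the listing).

```python
import os
from typing import Iterable, List

# Extensions and categories taken from the directory listing above.
EXTENSIONS = ("psl.gz", "oi.gz", "alidx", "md5")
CATEGORIES = ("native", "xeno")


def expected_files(prefixes: Iterable[str]) -> List[str]:
    """Build the file names expected for each prefix, e.g. 'mrna'."""
    return [
        f"{prefix}.{cat}.{ext}"
        for prefix in prefixes
        for cat in CATEGORIES
        for ext in EXTENSIONS
    ]


def missing_files(directory: str, prefixes: Iterable[str]) -> List[str]:
    """Return the expected file names that are not present in directory."""
    return [
        name
        for name in expected_files(prefixes)
        if not os.path.exists(os.path.join(directory, name))
    ]
```

Calling missing_files on a full/ directory with the prefixes in use would return an empty list when every alignment, index, and checksum file is in place.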
 
Index file
    Two alignment index files, one for native and one for xeno, are always
    created for the corresponding processed.gbidx file, even if there are no
    sequences to align. This makes it easy to check whether the alignment has
    completed. The file is tab-separated, in the format:
    acc version numaligns
    The name of the file is either mrna.alidx or
    est.*.alidx, and it is associated with the *.psl
    file of the same name. The columns are:
    
      - acc- GenBank or RefSeq accession
- version- Version number, not including the
      accession
- numAligns- Count of the number of alignments for this
      accession.
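
As an illustration of the index format described above, here is a minimal reader for the three tab-separated columns. It is a sketch, not the pipeline's own parser: the function names are hypothetical, and skipping blank lines and lines starting with "#" is an assumption about how comments or a header might appear.

```python
import io
from typing import Dict, List, TextIO, Tuple


def read_alidx(fh: TextIO) -> Dict[Tuple[str, int], int]:
    """Read an alignment index into {(acc, version): numAligns}."""
    counts: Dict[Tuple[str, int], int] = {}
    for line in fh:
        line = line.rstrip("\n")
        # Assumption: blank lines and '#' lines carry no data.
        if not line or line.startswith("#"):
            continue
        acc, version, num_aligns = line.split("\t")
        counts[(acc, int(version))] = int(num_aligns)
    return counts


def unaligned(counts: Dict[Tuple[str, int], int]) -> List[Tuple[str, int]]:
    """List the (acc, version) pairs that have zero alignments."""
    return [key for key, n in counts.items() if n == 0]


if __name__ == "__main__":
    sample = io.StringIO("AB000001\t1\t2\nAB000002\t3\t0\n")
    index = read_alidx(sample)
    print(index)
    print(unaligned(index))
```

Because an index is written even when nothing was aligned, an empty file still parses to an empty dictionary, which distinguishes "completed with no alignments" from "not yet run" (no file at all).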