Index of /genome/db99/Mpne/documentation

      Name                    Last modified       Size  Description

[DIR] Parent Directory        31-Jan-2000 15:49      -
[TXT] README                  09-Aug-1999 18:38    13k
[TXT] Source                  09-Aug-1999 18:38     1k
[TXT] seq.hdr                 09-Aug-1999 18:38     1k

#############################################################

README from ftp://ncbi.nlm.nih.gov/genbank/genomes/   

updated: NOV-7-98
#############################################################

This experimental directory includes some of the complete genomes,
chromosomes and large (> 350 kb) sequences present in
the Entrez genomes division (http://www.ncbi.nlm.nih.gov/Entrez/).

The files are present in the following formats:
NEW:
*.ffn = FASTA nucleotide coding regions file
NOTE: this file is now available for the updated M. genitalium
genome and the new sequence of P. falciparum chromosome 2. It will be
added to the other genomes as well.

*.asn = ASN.1 file, print form, replaces .prt
*.faa = FASTA Amino Acid file 
*.fna = FASTA Nucleic Acid file
*.gbk = GenBank flat file format
*.gbs = GenBank summary file format
*.ptt = ProTein Table
*.tab = Table to assemble genome
*.val = ASN.1 binary format
*.tar.Z  = unix tar and compressed files (not all files are compressed),
           where all of the above files are present.  If you want all
           files for a group, all you need is to FTP this one file.
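
As an illustration only, the extension list above can be written down as a
small lookup table. The sketch below is in Python; the function name
describe() and the example file names are assumptions made for this example
and are not part of the FTP site.

```python
# Extension-to-format table copied from the list above.
FORMATS = {
    ".ffn": "FASTA nucleotide coding regions file",
    ".asn": "ASN.1 file, print form",
    ".faa": "FASTA Amino Acid file",
    ".fna": "FASTA Nucleic Acid file",
    ".gbk": "GenBank flat file format",
    ".gbs": "GenBank summary file format",
    ".ptt": "ProTein Table",
    ".tab": "Table to assemble genome",
    ".val": "ASN.1 binary format",
    ".tar.Z": "unix tar and compressed files",
}


def describe(filename):
    """Return the format description for a file name, or None if unknown."""
    # Check longer suffixes first so ".tar.Z" is matched as a whole.
    for suffix in sorted(FORMATS, key=len, reverse=True):
        if filename.endswith(suffix):
            return FORMATS[suffix]
    return None


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ("mpneu.fna", "mpneu.ptt", "mpneu.tar.Z", "mpneu.xyz"):
        print(name, "->", describe(name))
```

Unknown extensions return None rather than raising an error, since not
every format is present for every genome.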

Presently these completed genomes are organized in the
following directories:

>> bacteria/Aful/

   aful.* = Archaeoglobus fulgidus complete genome

>> bacteria/Aquae/
 
   aquae.* = Aquifex aeolicus complete genome

>> bacteria/Bbur/

   bbur.* = Borrelia burgdorferi complete genome
   bbur.plasmids.* = Borrelia burgdorferi,  
                     11 plasmids, complete sequence. 

>> bacteria/Bsub/

   bsub.* = Bacillus subtilis complete genome 

>> bacteria/Ctra/

   ctra.* = Chlamydia trachomatis complete genome 

>> bacteria/Ecoli/

   ecoli.* = Escherichia coli complete genome **

   ** NOTE: there are two FASTA files for the E. coli genome,
      and these have been moved to bacteria/Ecoli/OLD:

   1) ecoli.fsa: From Fred Blattner and colleagues.
   2) ecoli.japan.fsa: From the E. coli database group
      (see ftp://ncbi.nlm.nih.gov/genbank/genomes/ecoli.japan.README)

>> bacteria/Hinf/

   hinf.* = Haemophilus influenzae Rd complete genome 

>> bacteria/Hpyl/

   hpyl.* = Helicobacter pylori complete genome

   hpyl.134-gbff.Z = the 134 GenBank Flat Files which are
                     presently in GenBank.

>> bacteria/Mgen/

   mgen.* = Mycoplasma genitalium complete genome 

>> bacteria/Mjan/

   mjan.* = Methanococcus jannaschii complete genome 

>> bacteria/Mpneu/

   mpneu.* = Mycoplasma pneumoniae complete genome

>> bacteria/Mthe/

   mthe.* = Methanobacterium thermoautotrophicum complete genome.

>> bacteria/Mtub/

   mtub.* = Mycobacterium tuberculosis H37Rv complete genome.

>> bacteria/Pyro/

   pyro.* = Pyrococcus horikoshii complete genome

>> bacteria/R_pNGR/

   pNGR234.* = Rhizobium sp. NGR234 complete plasmid sequence

>> bacteria/Synecho/

   synecho.* = Synechocystis PCC6803 complete genome

>> bacteria/Tpal/

   tpal.* = Treponema pallidum complete genome

>> C_elegans/CHR_[I-V,X]

   worm_[I-V,X].* = Caenorhabditis elegans genome sequence

   PLEASE NOTE: this is an attempt to generate complete sequences for all
   the chromosomes. It is a work in progress: gaps are represented by 'N'
   in *.fna files, and *.gbk files contain only FEATURE TABLE information.

>> S_cerevisiae/Chr[01-16]

   yst_chr_[01-16].* = Saccharomyces cerevisiae complete genome

   PLEASE NOTE: we are presently experiencing a problem with yeast
   chromosome XII (12) and we hope to remedy the yst_chr_12.gbk
   file shortly. It is an assembly problem, not one caused by the
   component GenBank flat files (which are all OK and all present).

>> P_falciparum/CHR_II

   pfal_2.* = Plasmodium falciparum chromosome 2, complete sequence
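
The C. elegans note above says that gaps are represented by 'N' in *.fna
files. As a rough sketch of how such gaps could be located, the Python below
reads FASTA text and reports each run of N. The function names read_fasta()
and gap_runs() and the example record are assumptions made for this
illustration; the record is made up and is not real sequence data.

```python
import io
import re


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gap_runs(sequence):
    """Return (start, length) for each run of N; start is 0-based."""
    return [(m.start(), m.end() - m.start())
            for m in re.finditer(r"[Nn]+", sequence)]


if __name__ == "__main__":
    # Made-up record for illustration only; not real worm data.
    example = io.StringIO(">example_chromosome\nACGTNNNNAC\nGTNNACGT\n")
    for header, seq in read_fasta(example):
        print(header, "length", len(seq), "gaps", gap_runs(seq))
```

For a real file, the io.StringIO object would be replaced by an open file
handle, for example open("worm_I.fna"), assuming that file has been
downloaded.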

--------------------------------------------------------------

Not all formats are available for all large sequences in the
genomes division, but it is our goal to provide them.
Please let us know if a format you want is missing;
we may be able to produce it.
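
Since the set of formats varies from genome to genome, it can help to check
which ones are absent before requesting one. The Python below is a minimal
sketch under stated assumptions: the file names in a directory are already
available as a list, and files are named <prefix>.<extension> as described
in this README. The function name missing_formats() and the example listing
are hypothetical.

```python
# Extensions taken from the format list in this README (*.tar.Z omitted).
EXPECTED = ("asn", "faa", "ffn", "fna", "gbk", "gbs", "ptt", "tab", "val")


def missing_formats(prefix, filenames):
    """Return the expected extensions with no '<prefix>.<ext>' file present."""
    present = set(filenames)
    return [ext for ext in EXPECTED if f"{prefix}.{ext}" not in present]


if __name__ == "__main__":
    # Hypothetical listing, for illustration only.
    listing = ["mpneu.faa", "mpneu.fna", "mpneu.gbk", "mpneu.ptt"]
    print(missing_formats("mpneu", listing))
```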

--------------------------------------------------------------

If you have any questions about this document or the files within
this directory, please do not hesitate to contact us:

National Center for Biotechnology Information 
Building 38A, Rm 8N-803
National Library of Medicine, 
National Institutes of Health
Bethesda, MD, 20894, USA
 
telephone: (301) 496-2475
fax:       (301) 480-9241
e-mail:    info@ncbi.nlm.nih.gov


----------------------------------------------------------------------

 DOCUMENT AND DIRECTORY REVISION HISTORY:
 
----------------------------------------------------------------------
 Date    | Change
======================================================================
 96-04-09| Installed this README file along with the various
         | versions of Haemophilus influenzae and Mycoplasma
         | genitalium genomes. 
----------------------------------------------------------------------
 96-06-13| Changed the directory structure for the FTP
         | directory for the genomes division.
----------------------------------------------------------------------
 96-06-16| Added binary and printform versions of the ASN.1 for the 
         | Haemophilus influenzae and Mycoplasma genitalium genomes. 
----------------------------------------------------------------------
 96-08-22| Added Methanococcus jannaschii complete genome files.
----------------------------------------------------------------------
 96-09-10| Added Saccharomyces cerevisiae complete genome files.
         | (still missing Chromosomes I and XVI)
----------------------------------------------------------------------
 96-09-15| Added Saccharomyces cerevisiae complete genome files:
         | Chromosome I
----------------------------------------------------------------------
 96-09-18| Added Saccharomyces cerevisiae complete genome files:
         | Chromosome XVI, and updated all S_cerevisiae *.gbs files.
----------------------------------------------------------------------
 96-10-11| Refreshed the Haemophilus influenzae and
         | Methanococcus jannaschii complete genome files.
----------------------------------------------------------------------
 96-11-04| Added Synechocystis PCC6803 complete genome files.
----------------------------------------------------------------------
 96-11-22| Added Mycoplasma pneumoniae complete genome files.
----------------------------------------------------------------------
 96-12-17| Updated all Saccharomyces cerevisiae complete genome files.
----------------------------------------------------------------------
 97-01-25| Added Escherichia coli complete genome files. 
----------------------------------------------------------------------
 97-05-23| Added Rhizobium sp. NGR234 complete plasmid sequence
         | in all supported formats.
----------------------------------------------------------------------
 97-08-07| Added Helicobacter pylori complete genome files.
----------------------------------------------------------------------
 97-08-14| Re-organized directory structures for all genomes
         | on FTP directory.
----------------------------------------------------------------------
 97-09-04| Installed Escherichia coli update, as a single GenBank
         | flat file on FTP directory. More later!
----------------------------------------------------------------------
 97-09-10| Installed complete set of Escherichia coli
         | sequence files.
----------------------------------------------------------------------
 97-09-12| Re-Installed complete set of Escherichia coli
         | sequence files.
----------------------------------------------------------------------
 97-11-17| Installed Methanobacterium thermoautotrophicum
         | complete genome sequence files.
----------------------------------------------------------------------
 97-11-24| Installed Bacillus subtilis complete genome sequence files.
----------------------------------------------------------------------
 97-12-02| Installed Archaeoglobus fulgidus complete genome 
         | sequence files.
----------------------------------------------------------------------
 97-12-20| Installed updates of Archaeoglobus fulgidus complete genome 
         | sequence files.
----------------------------------------------------------------------
 97-12-20| Installed Borrelia burgdorferi complete genome and
         | plasmid sequence files.
----------------------------------------------------------------------
 97-12-24| Installed updates of Rhizobium sp. NGR234 complete plasmid
         | sequence files.
----------------------------------------------------------------------
 98-01-26| Reinstalled all bacterial genomes with new file formats and
         | tables.
----------------------------------------------------------------------
 98-01-30| Reinstalled Methanococcus jannaschii and Haemophilus 
         | influenzae Rd complete genomes. Because of discrepancies
         | that were pointed out to us we regenerated the complete
         | genome files we keep on our FTP directory.  These are now 
         | in agreement with the public data that is present on the 
         | TIGR FTP site.
----------------------------------------------------------------------
 98-02-06| Reinstalled Borrelia burgdorferi complete genome to correct
         | an error in the topology of the genome.
----------------------------------------------------------------------
 98-03-25| Installed Aquifex aeolicus complete genome sequence files.
----------------------------------------------------------------------
 98-03-27| Reinstalled Aquifex aeolicus complete genome sequence files;
         | added gene symbols aq_* to the prottable (aquae.ptt)
----------------------------------------------------------------------
 98-05-21| Installed Caenorhabditis elegans genome sequence files
         | for all six chromosomes
----------------------------------------------------------------------
 98-06-15| Reinstalled Haemophilus influenzae Rd complete genome. 
----------------------------------------------------------------------
 98-06-22| Installed Pyrococcus horikoshii complete genome sequence 
         | files
----------------------------------------------------------------------
 98-06-28| Reinstalled Borrelia burgdorferi plasmids files which 
         | contain updates to cp9 and cp26
----------------------------------------------------------------------
 98-07-08| Installed Mycobacterium tuberculosis H37Rv complete genome 
         | sequence files
----------------------------------------------------------------------
 98-07-16| Installed Treponema pallidum complete genome 
         | sequence files
----------------------------------------------------------------------
 98-07-20| Installed Chlamydia trachomatis complete genome 
         | sequence files
----------------------------------------------------------------------
 98-09-04| Reinstalled Chlamydia trachomatis complete genome 
         | sequence files (17 genes added)
----------------------------------------------------------------------
 98-10-22| Reinstalled Pyrococcus horikoshii complete genome 
         | sequence files (AP000001-AP000007 updated)
----------------------------------------------------------------------
 98-10-23| Reinstalled Treponema pallidum complete genome 
         | sequence files (annotation fix: GeneMark/Glimmer)
----------------------------------------------------------------------
 98-10-23| Reinstalled Borrelia burgdorferi complete genome 
         | sequence files (annotation fix: GeneMark/Glimmer)
----------------------------------------------------------------------
 98-11-06| Installed Plasmodium falciparum chromosome 2, complete 
         | sequence files
----------------------------------------------------------------------
 98-11-07| Installed updated Mycoplasma genitalium, complete genome
         | sequence files
----------------------------------------------------------------------
 98-11-10| Reinstalled Saccharomyces cerevisiae 16 chromosomes
         | sequence files
----------------------------------------------------------------------
 98-11-10| *.ffn = FASTA nucleotide coding regions file added to
         | Aquifex aeolicus, Archaeoglobus fulgidus, Borrelia
         | burgdorferi, Bacillus subtilis
----------------------------------------------------------------------
 98-11-11| Updated *.ffn for Plasmodium falciparum chromosome 2 
         | (corrections of intron/exon structure of coding regions)
----------------------------------------------------------------------
 98-11-11| *.ffn = FASTA nucleotide coding regions file added to
         | Chlamydia trachomatis, Haemophilus influenzae,
         | Helicobacter pylori, Methanococcus jannaschii,
         | Mycoplasma pneumoniae, Methanobacterium thermoautotrophicum,
         | Mycobacterium tuberculosis H37Rv, Pyrococcus horikoshii,
         | Rhizobium sp. NGR234, Synechocystis PCC6803,
         | Treponema pallidum
----------------------------------------------------------------------
 98-11-11| Reinstalled C.elegans 6 chromosomes (new sequences from WashU)
         | 
----------------------------------------------------------------------
 98-11-14| Installed Rickettsia prowazekii sequence files
----------------------------------------------------------------------
 98-11-18| Reinstalled E.coli sequence files
----------------------------------------------------------------------
 98-12-10| Reinstalled Mycobacterium tuberculosis sequence files
         | mtub.ffn added
----------------------------------------------------------------------