Naming bacterial species and 16S rRNA gene sequences for type strains
Naming bacteria
See also the following articles:
- When does a clone deserve a name
- Should names reflect the evolution of bacterial species
- Defining bacterial species.
- Fuzzy species among recombinogenic bacteria.
- The mosaic structure of bacterial genomes.
- Alphabetical lists of validly described and not (yet) validly described bacterial species.
- Species validly described in IJSEM, with links to GenBank entries, PDFs, abstracts and full text (restricted access): Vol56 Vol55
- Species validly described in IJSEM, with links to GenBank entries, PDFs, abstracts and full text: Vol54 Vol53
- Species validly described in IJSEM, with links to GenBank entries, PDFs and abstracts: Vol52 Vol51 Vol50 Vol49 Vol48 Vol47 Vol46 Vol45
- Species described with abstract only (no sequence): Vol44 Vol43 Vol42 Vol41 Vol40
- The accession numbers of sequences for new type strains are not always explicitly given; in older issues there is no number at all, and some descriptions were made without sequencing the 16S rRNA gene.
- In the most recent issues, the accession numbers of new strains seem to be always explicitly given in the abstract.
- When they are not, the list may contain sequences that do not belong to reference strains, because it is built by parsing the PDF files with regular expressions.
- Also some descriptions do not contain any 16S rRNA gene sequence.
- Finally, some accession numbers are wrong or misspelled, as in the following examples:
- AFO56710 instead of AF056710 (a letter O in place of the digit 0): IJSEM 48 (4): 1095.
- AF0033672 (one 0 too many): IJSEM 48 (3): 821.
- J271157 instead of AJ271157 (first letter missing): Aguilera et al., IJSEM 51 (5): 1687.
- AJ23842 and AJ23841 instead of AJ238042 and AJ238041: Schlesner et al., IJSEM 51 (2): 425.
- Please send me an e-mail if you see any error; I will improve the regex to fix it.
- Watch for updates, as my parsing gets better !
- In preparation: 16S rRNA sequences of validly described reference strains (let me know if you want my preliminary files).
- regex of ijsem files
- acnuc check for bacteria & 16S rRNA
- db check
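The kinds of accession-number errors listed above (letter O for digit 0, an extra digit, a missing letter or digit) can be caught with a simple format check. The following Python sketch is only an illustration, not the regex actually used to build these lists. It assumes the classic nucleotide accession layouts (one letter plus five digits, or two letters plus six digits); the function name `diagnose` and its messages are hypothetical.

```python
import re

# Assumed layouts: one letter + five digits (e.g. X80725)
# or two letters + six digits (e.g. AF056710).
VALID = re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6})$")


def diagnose(token):
    """Return "ok" or a short description of what looks wrong."""
    token = token.strip().upper()
    if VALID.match(token):
        return "ok"

    # A letter O typed in place of the digit 0, after a 1- or 2-letter prefix.
    for prefix_len in (2, 1):
        candidate = token[:prefix_len] + token[prefix_len:].replace("O", "0")
        if candidate != token and VALID.match(candidate):
            return f"letter O for digit 0? try {candidate}"

    # Letters followed by digits, but with the wrong counts.
    m = re.match(r"^([A-Z]+)(\d+)$", token)
    if m:
        letters, digits = m.groups()
        return f"bad layout: {len(letters)} letter(s) + {len(digits)} digit(s)"

    return "unrecognized format"


if __name__ == "__main__":
    for acc in ["AF056710", "AFO56710", "AF0033672", "J271157", "AJ23842"]:
        print(acc, "->", diagnose(acc))
```

A check like this only flags malformed numbers. It cannot tell whether a well-formed accession really belongs to the type strain, so a lookup in the sequence database is still needed.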
Introduction.
With the use of molecular methods, the identification of micro-organisms (bacteria, archaea, viruses, fungi and all kinds of protists) has undergone a revolution, in particular thanks to phylogenetic analyses of ubiquitous genes (often the SSU rRNA gene sequences).
This revolution was particularly helpful for Bacteria (and Archaea), since an international body checks and publishes valid names, ensuring that the same organism is not published twice under different names (an awful situation often encountered for fungi!) and that the name's spelling is correct (according to the "rules").
Nevertheless, the situation is not always clear:
- Numerous species have changed names because of more accurate molecular methods (as opposed to phenotypes).
- Old names are still present in old publications (and old EMBL/GenBank entries).
- Some recent papers (not written by "taxonomists") still use an old name.
- It is not always easy to find the "true" sequence from the name (particularly the sequence of the type strain).
- For a single species, one may find a large number of sequences: which one to choose?
- The name of the strain is not always given in the EMBL entry.
- See this example (one among others).
- Some sequences are truly bad (lots of sequencing errors).
- Some sequences are very short, or the "wrong" strand has been submitted.
- Even worse, it is quite frequent to find very bad identifications:
- A contaminating species has been sequenced (no checking by phylogenetic analysis before submission).
- The description says it is a 16S sequence, but it is not!
- Taxonomy above the genus rank provided by the OC field (EMBL entry) is often approximate.
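Some of the problems listed above (very short sequences, many ambiguous bases, the "wrong" strand) can be screened for automatically before a sequence is used as a reference. The Python sketch below is a minimal illustration under stated assumptions: the length and ambiguity thresholds are arbitrary, and strand orientation is guessed from a conserved 16S motif, namely the site of the widely used 515F primer, GTGCCAGCMGCCGCGGTAA. The function `screen` is hypothetical and none of this comes from the original page.

```python
import re

# IUPAC complement table (U is converted to T before use).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Assumption: site of the 515F universal primer, with M = A or C.
MOTIF = re.compile(r"GTGCCAGC[AC]GCCGCGGTAA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def screen(seq, min_length=1200, max_ambiguous=0.01):
    """Return a list of problems found; an empty list means none detected.

    min_length and max_ambiguous are illustrative thresholds only.
    """
    s = re.sub(r"\s+", "", seq).upper().replace("U", "T")
    problems = []

    if len(s) < min_length:
        problems.append("too short")

    ambiguous = sum(1 for base in s if base not in "ACGT")
    if s and ambiguous / len(s) > max_ambiguous:
        problems.append("too many ambiguous bases")

    if MOTIF.search(s):
        pass  # conserved motif found on the submitted strand
    elif MOTIF.search(reverse_complement(s)):
        problems.append("reverse strand submitted")
    else:
        problems.append("conserved 16S motif not found")

    return problems
```

A sequence that passes this screen can still come from a contaminant or from a misidentified strain; only a phylogenetic analysis against trusted reference sequences addresses that.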
What should be known:
- For a species to be validly described (in fact there is no "valid species", only "species validly described"; thanks to J. Euzeby for this clarification):
- The strain should be kept in culture (with some exceptions, as for the intracellular pathogen Mycobacterium leprae).
- The strain should be well characterized by its phenotype.
- This phenotype should show at least one significant difference with respect to a previously described species.
- The Latin name should "pass the rules".
- The name should be approved by the ad hoc committee, in accordance with the International Code of Nomenclature of Bacteria (Lapage et al., 1992).
- See below some clarifications by J. Euzeby.
- For each species, one type strain is designated (often the first strain described for this species).
- The other strains of the species display similarity to it at the genomic level, as assessed by denaturation-renaturation curves when the two genomes are mixed.
- This result may vary depending upon the method, the quality of extraction ...
- Complete sequencing of several strains of the same species has demonstrated that bacterial genomes are in fact of mosaic nature. They contain conserved parts present in every strain (the skeleton) and parts that are present only in some strains. When such divergent parts are large, one may obtain low melting temperatures between different strains of the same species.
- In some cases, sub-species are described, when the similarity at the genomic level is close to or slightly below the critical level, or the phenotype is very different (but phylogenetic analyses suggest the same genus).
- In one case, there are sub-genera (Moraxella).
- For Salmonella, there is a double naming that may cause problems for non-specialists.
- Some species (e.g. Shigella) are wrongly described. They are true Escherichia coli strains, but their original name is kept so as not to confuse physicians (their pathogenic phenotype is often due to the presence of a plasmid).
- For each genus, a type species is designated (usually the first species isolated for this genus).
- Naming a new species (or re-naming) should obey three rules (but see Euzeby's clarifications below):
- Publication, i.e. a scientific article that describes the strain;
- Legitimacy (a complete description that follows the rules);
- Priority (the same name has not been previously used).
Since January 1980, priority is assessed according to the "APPROVED LISTS OF BACTERIAL NAMES" (Skerman et al., 1980). Presently this list contains about 2,000 described species (see Euzeby's site and also a condensed list). However, not every culture collection strictly follows the rule (see this list).
My own experience after ordering hundreds of type strains from a major collection (ATCC) showed that a few percent of their strains are in fact contaminants. Do sequence the 16S rRNA gene and run a phylogenetic analysis to make sure you have the proper species before starting serious work on it.
Names which are not in this list are not validly published; however, the corresponding strains may be held in collections (Pasteur, DSMZ, ATCC, ...) and be widely used in important industrial applications. Simply, nobody took on the burden of describing them appropriately!
Updates of this list are published in the "INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY" (IJSEM), by validation of descriptions published in the same journal or elsewhere.
Jean Euzeby extracts from these lists all new names, combinations or modifications and publishes everything as a web site. This is an enormous amount of work that allows every microbiologist to easily find the current names without having to go to IJSEM and work out in which issue the final change appeared!
Alternatively you may go to the DSMZ or the "collection Pasteur" and of course ATCC.
BUT: you should know that their web pages are much less precise than the work of J. Euzeby (not up to date, many errors...).
I have collected the abstracts of IJSEM and extracted titles and accession numbers (see top of this page).
In preparation: a page with direct access to the 16S rRNA gene sequence for every validly published bacterial species.
Names that are not validly published should be written in quotation marks (e.g. "Genus name"), but this rule is very rarely followed outside bacterial taxonomy publications.