A Microbial Ontology.
1. GENERAL INTRODUCTION
WHAT IS IN AN ONTOLOGY?
An ontology defines a common vocabulary for researchers who need to share information in a domain. It includes machine-interpretable definitions of basic concepts in the domain and relations among them.
An ontology is a formal explicit description of concepts in a domain
(classes), properties of each concept describing various features and
attributes of the concept (slots or properties), and restrictions on
slots (facets).
An ontology together with a set of individual instances of classes constitutes a knowledge base (1,2,3).
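The vocabulary above (classes, slots, facets, instances) can be sketched as a small data structure. This is only an illustration, not the format of any ontology tool: the names `Slot`, `OntologyClass` and `Instance`, the class "Bacterium" and its slots are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property of a class; its facets restrict the values it may take."""
    name: str
    value_type: type      # facet: allowed type of each value
    max_values: int = 1   # facet: maximum number of values (cardinality)


@dataclass
class OntologyClass:
    """A concept in the domain, described by its slots."""
    name: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot):
        self.slots[slot.name] = slot


@dataclass
class Instance:
    """An individual member of a class, holding concrete slot values."""
    of_class: OntologyClass
    values: dict  # slot name -> list of values

    def validate(self):
        """Return True if every value respects the facets of its slot."""
        for slot_name, vals in self.values.items():
            slot = self.of_class.slots.get(slot_name)
            if slot is None:
                return False  # value given for a slot the class does not define
            if len(vals) > slot.max_values:
                return False  # cardinality facet violated
            if not all(isinstance(v, slot.value_type) for v in vals):
                return False  # type facet violated
        return True


# Hypothetical class with two slots (illustration only).
bacterium = OntologyClass("Bacterium")
bacterium.add_slot(Slot("gram_stain", str, max_values=1))
bacterium.add_slot(Slot("habitat", str, max_values=3))

# An ontology plus a set of instances forms a knowledge base.
knowledge_base = {
    "classes": [bacterium],
    "instances": [
        Instance(bacterium, {"gram_stain": ["negative"], "habitat": ["gut", "soil"]}),
    ],
}
```

Calling `validate()` on an instance checks its values against the facets, which is one simple way a program can use the restrictions an ontology declares.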
In recent years the development of ontologies has been moving from
artificial intelligence laboratories to numerous domains. Many
disciplines now develop standardized ontologies that domain experts can
use to share and annotate information in their fields. The World Wide
Web Consortium (W3C) has developed the Resource Description Framework
(RDF), a language for encoding knowledge and making it understandable
to electronic agents searching for information, and is now working on
OWL (www.w3.org/2001/sw/WebOnt/; see also OIL:
http://oiled.man.ac.uk/building/).
WHY DEVELOP AN ONTOLOGY?
- To share a common understanding of the structure of information among people or software agents.
- To enable reuse of domain knowledge.
- To make domain assumptions explicit.
- To separate domain knowledge from operational knowledge.
- To analyze domain knowledge.
An ontology of a domain is not a goal in itself. Developing an ontology
is akin to defining a set of data and their structure for other
programs to use. Problem-solving methods, domain-independent
applications, and software agents use ontologies and knowledge bases
built from ontologies as data.
ONTOLOGIES IN BIOLOGY.
See a full list of ontologies and their descriptions on OBO.
In biology, several ontologies are currently available, among which one can cite:
- Gene Ontology™ (GO). Developed by the Gene Ontology Consortium to
help annotate information on gene products (not the genes themselves)
using three organizing principles: molecular function, biological
process and cellular component (4). This was the first ontology in
genomics and is currently the core one. GO is now being used, for
example, to rapidly analyze data obtained with DNA chips carrying
probes for the entire genome of an organism. (references).
- Trait Ontology™ (TO). A controlled vocabulary that describes each
trait as a distinguishable feature, characteristic, quality or
phenotypic feature of a developing or mature individual. Examples are
glutinous endosperm, disease resistance, dwarf, photosensitivity, male
sterility, etc. The present trait ontology is rice-specific.
- Plant Ontology™ (PO). Gramene is collaborating with The Maize Mapping
Project (MaizeDB), The Arabidopsis Information Resource (TAIR), and the
International Rice Research Institute (IRRI) as part of the Plant
Ontology™ Consortium (POC) to develop a controlled vocabulary for plant
anatomy and growth stages (5,6).
A BRIEF DESCRIPTION OF GO.
This ontology is currently organized as a parent/child description
(Figure 1). A parent can have several children; each child inherits all
of its parent's properties but also has more specialized properties of
its own.
Figure 1: The parent-to-child relationship.
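The inheritance rule just described can be sketched in a few lines. This is a minimal illustration, not how GO itself is stored: the `Term` class, the term names "enzyme" and "kinase", and their properties are hypothetical.

```python
class Term:
    """A term that inherits every property of its parents and adds its own."""

    def __init__(self, name, own_properties=(), parents=()):
        self.name = name
        self.own_properties = set(own_properties)
        self.parents = list(parents)

    def all_properties(self):
        """Collect the term's own properties plus those inherited from all parents."""
        props = set(self.own_properties)
        for parent in self.parents:
            props |= parent.all_properties()
        return props


# Hypothetical parent and child terms (illustration only).
enzyme = Term("enzyme", {"catalyses a reaction"})
kinase = Term("kinase", {"transfers a phosphate group"}, parents=[enzyme])

print(sorted(kinase.all_properties()))
```

The child reports both the property it inherited and the more specialized one it defines itself, while the parent is unchanged.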
A child can have several parents, which are usually, but not
necessarily, located in different domains of the ontology (GO has three
domains). Figure 2 below shows how the term "Pheromone processing" has
multiple successive parents, all located within the domain
"Biological Processes".
Figure 2. The links to several parents contain the knowledge attached to a term of the ontology.
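Because a term may have several parents, the knowledge attached to it is the set of all terms reachable by following parent links upward. The sketch below assumes the ontology is held as a plain dictionary from each term to its direct parents. The intermediate term names are placeholders, not the actual parents shown in Figure 2.

```python
# Hypothetical fragment: each term maps to the list of its direct parents.
# "intermediate term A/B/C" are placeholders, not real GO terms.
PARENTS = {
    "pheromone processing": ["intermediate term A", "intermediate term B"],
    "intermediate term A": ["biological process"],
    "intermediate term B": ["intermediate term C"],
    "intermediate term C": ["biological process"],
    "biological process": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:  # a term reached by two paths is visited once
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("pheromone processing")))
```

Tracking visited terms matters here: with multiple parents, the same ancestor can be reached along more than one path.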
In genomics, ontologies have now become a de facto standard as a
controlled vocabulary for annotating the functions, pertinent processes
and cellular locations of gene products (7). The initial aim was to
provide a concise, standardised set of terms, agreed on by the
biological community, that would assist biologists in manually
comparing properties among data sets. However, biologists are now
beginning to use gene ontologies to support much more ambitious goals,
where manual analysis is replaced by automatic reasoning performed by
bioinformatics applications (a new form of data mining).
REFERENCES.
1. Noy, N.F., and McGuinness, D.L. Ontology Development 101: A Guide
to Creating Your First Ontology. Stanford University, Stanford, CA,
94305. http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
2. Creating the gene ontology resource: design and implementation.
Genome Research, 2001, Vol 11: 1425-1433.
http://www.genome.org/cgi/content/full/11/8/1425
3. Bard, J. Ontologies: Formalising biological knowledge for
bioinformatics. Bioessays, 2003, Vol 25: 501-506.
4. Gene Ontology: tool for the unification of biology. Nature
Genetics, 2000, Vol 25: 25-29.
http://www.nature.com/cgi-taf/DynaPage.taf?file=/ng/journal/v25/n1/full/ng0500_25.html
5. The Plant Ontology™ Consortium and Plant Ontologies. Comparative
and Functional Genomics, 2002, Vol 3: 137-142.
http://www3.interscience.wiley.com/cgi-bin/fulltext/91016119/FILE?TPL=ftx_start
6. Gramene: development and integration of trait and gene ontologies
for rice. Comparative and Functional Genomics, 2002, Vol 3: 132-136.
http://www3.interscience.wiley.com/cgi-bin/fulltext/91016047/FILE?TPL=ftx_start
7. Doniger, S.W., Salomonis, N., Dahlquist, K.D., Vranizan, K.,
Lawlor, S.C., and Conklin, B.R. MAPPFinder: using Gene Ontology and
GenMAPP to create a global gene-expression profile from microarray
data. Genome Biology, 2003, 4: R7.
8. See also the most interesting opinion in Nature (2001) 413: 1-3.