Linking experimental results, biological networks and sequence analysis methods using Ontologies and Generalised Data Structures
Abstract
The structure of a closely integrated data warehouse is described that is
designed to link varying numbers and types of biological networks, sequence
analysis methods and experimental results such as those coming from
microarrays. The data schema combines graph-based methods with generalised data
structures and makes use of ontologies and meta-data. The core idea is to store
biological networks as graphs, and to use generalised data structures (GDS) for
the storage of further relevant information. This is possible because many
biological networks can be represented as graphs: protein interactions, signal
transduction networks, metabolic pathways, gene regulatory networks, etc. Nodes
in biological graphs represent entities such as promoters, proteins, genes and
transcripts, whereas the edges of such graphs specify how the nodes are related.
The semantics of the nodes and edges are defined using ontologies of node and
relation types. Besides the generic attributes that most biological entities
possess (name, attribute description), further information is stored using
generalised data structures. By linking directly and systematically to the
underlying sequences (exons, introns, promoters, amino acid sequences), close
interoperability with sequence analysis methods can be achieved. This approach
allows us to store, query and update a wide variety of biological information
in a semantically compact way, without requiring changes at the database schema
level when new kinds of biological information are added.
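
As an illustration only, the idea of typed graphs plus generalised data
structures can be sketched in a few lines of Python. This is a minimal sketch
under stated assumptions, not the ONDEX schema or API: the class names, the
type vocabularies and the example entities (`geneA`, `proteinA`) are all
hypothetical. Node and relation types are checked against small ontologies,
and anything beyond the generic attributes goes into a free-form `gds`
dictionary.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Hypothetical ontologies of node and relation types (not taken from ONDEX).
NODE_TYPES = {"gene", "transcript", "protein", "promoter"}
RELATION_TYPES = {"encodes", "interacts_with", "regulates"}


@dataclass
class Node:
    node_id: int
    node_type: str                 # must be a term from NODE_TYPES
    name: str                      # generic attribute
    description: str = ""          # generic attribute
    gds: Dict[str, Any] = field(default_factory=dict)  # further information


@dataclass
class Edge:
    source: int                    # node_id of the source node
    target: int                    # node_id of the target node
    relation_type: str             # must be a term from RELATION_TYPES
    gds: Dict[str, Any] = field(default_factory=dict)


class Graph:
    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_type: str, name: str,
                 description: str = "", **gds: Any) -> Node:
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        node = Node(len(self.nodes) + 1, node_type, name, description, dict(gds))
        self.nodes[node.node_id] = node
        return node

    def add_edge(self, source: Node, target: Node,
                 relation_type: str, **gds: Any) -> Edge:
        if relation_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {relation_type}")
        edge = Edge(source.node_id, target.node_id, relation_type, dict(gds))
        self.edges.append(edge)
        return edge

    def neighbours(self, node: Node,
                   relation_type: Optional[str] = None) -> List[Node]:
        # Follow outgoing edges, optionally restricted to one relation type.
        return [self.nodes[e.target] for e in self.edges
                if e.source == node.node_id
                and (relation_type is None or e.relation_type == relation_type)]


# Usage with made-up entities and values.
g = Graph()
gene = g.add_node("gene", "geneA", sequence="ATGGCC")
protein = g.add_node("protein", "proteinA", expression_level=2.5)
g.add_edge(gene, protein, "encodes")

# A new kind of entity is a new ontology term, not a new class or table.
NODE_TYPES.add("metabolite")
metabolite = g.add_node("metabolite", "metaboliteA", formula="C6H12O6")
```

In this sketch, supporting a new kind of biological information means adding
a term to an ontology or a key to `gds`, while the `Node`, `Edge` and `Graph`
definitions stay unchanged.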
We describe how this data warehouse is being implemented by extending the
text-mining framework ONDEX to link, support and complement different
bioinformatics applications and research activities such as microarray
analysis, sequence analysis and modelling/simulation of biological systems. The
system is developed under the GPL and can be downloaded from
http://sourceforge.net/projects/ondex/
Keywords: graph database, ontology, Generalised Data Structures, semantic data integration