GandrKB
A knowledgebase for integrative modeling and access to Microarray annotation data

Introduction
Tutorials (posters, films and further descriptions)
Installation details for the HU95Av2 GandrKB
Gandr for other Affymetrix Chips
WebGandr - online version of the HU95Av2 GandrKB
Gandr ontology in other representation formats
Protege plugin development (QueryExportTab)
Introduction
GandrKB (Gene annotation data representation) is a client-based, platform-independent Protégé-2000 knowledgebase.
This means it is a frame-based rather than a spreadsheet-based kind of "database". Its schema is object-oriented rather than relational, so it focuses on the interrelations within the contained annotation data. Since "data in relation" is what we call "knowledge", we call the system a knowledgebase.
It emphasizes the immediate accessibility of the context of the data.
The GandrKB allows integrative knowledge management, querying and visualization of microarray data annotations and expression values.
The underlying ontology serves as an object-oriented knowledgebase schema to integrate, structure and visualize expression data and NetAffx gene context data in a frame-based way. As an example, it is worked out here in detail for the analysis of Toll-like receptor signalling pathways.
ProbeSetIDs are represented as instances of annotation concepts and are visualized as frames instead of rows in tables.
Target users: The target user is the wet-lab scientist who evaluates microarray experiments. We do not claim that the underlying object-oriented philosophy is easy to understand, but we believe it can be understood well enough, within a reasonable time, for the scientist to use the system effectively.
The user is well shielded from informatics details and data formats, and can install the system and immediately start investigating the genes represented on the microarray. The ontology itself is self-explanatory to a certain degree and provides guidance on its usage. The comparatively intuitive graphical user interface of the Protege system further eases the use of the GandrKB system for the non-IT specialist.
Gene annotation: We believe that, besides GO, we need a way to do decentralized, laboratory-specific gene annotation with richer and more formal semantics than the simple text currently used in spreadsheets.
We present an approach to enable the domain experts themselves to create intuitive and yet descriptive and formal gene annotation concepts.
Users can expand the provided ontology with their own annotation concepts, resulting in search attributes and classifiers that reflect their laboratory-specific terminology.
We provide a basic ontology of annotation concepts and gene context data, which allows genes to be annotated with these provided concepts or with the user's own ontological concepts.
Annotating a gene (referenced by its ProbeSetID) with a conceptual description is done simply by dragging and dropping the ProbeSetID into the new annotating concept. The ProbeSetID then automatically inherits all attributes of the new annotation concept and its superconcepts. Because many genes can be annotated in parallel, the annotation process itself becomes faster, and it is guided by constraints provided by the core knowledge model.
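The inheritance of attributes from a concept and its superconcepts can be illustrated with a minimal Python sketch. This is not how Protege implements frames; the class Concept, the function inherited_slots, the slot names and the ProbeSetID are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A hypothetical annotation concept with its own slots and one parent."""
    name: str
    slots: dict = field(default_factory=dict)
    parent: Optional["Concept"] = None


def inherited_slots(concept: Concept) -> dict:
    """Collect the slots of a concept and of all its superconcepts.

    Slots defined on a more specific concept override those of its ancestors.
    """
    chain = []
    node = concept
    while node is not None:
        chain.append(node)
        node = node.parent
    merged = {}
    for node in reversed(chain):  # root first, most specific concept last
        merged.update(node.slots)
    return merged


# Illustrative hierarchy; concept and slot names are assumptions.
receptor = Concept("Receptor", {"location": "membrane"})
gpcr = Concept("G_Protein_Coupled_Receptor", {"signals_via": "G protein"}, receptor)
adrenergic = Concept("Adrenergic_Receptor", {"ligand": "catecholamine"}, gpcr)

# "Dropping" a made-up ProbeSetID into a concept: it receives all inherited slots.
annotations = {"probe_A": inherited_slots(adrenergic)}
```

In this sketch, probe_A ends up with the slots of Adrenergic_Receptor, G_Protein_Coupled_Receptor and Receptor, although it was only placed into the most specific concept.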
Visualization: Annotated genes can be interrelated through special attributes. By interrelating concepts and instances through such complex slots, a context or knowledge net is built up step by step. This gene context can be visualized automatically as a pathway or network.
These networked annotation models can be browsed and explored intuitively with advanced interactive, graph-based visualization tools such as TouchGraph, GraphViz and SHriMP/Jambalaya.
This allows access to expression, annotation and context data in a free, intuitive and associative browser-like manner.
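To give a rough idea of how slot-based relations can become a drawable network, the following Python sketch turns a list of relations into GraphViz DOT text. It is only an illustration under stated assumptions and not the code of the Protege plugins: the function relations_to_dot, the slot name signals_to and the example gene symbols are hypothetical.

```python
def relations_to_dot(relations, graph_name="gene_context"):
    """Render (source, slot, target) triples as GraphViz DOT text.

    Each triple becomes a directed edge labelled with the slot name.
    """
    lines = [f"digraph {graph_name} {{"]
    for source, slot, target in relations:
        lines.append(f'  "{source}" -> "{target}" [label="{slot}"];')
    lines.append("}")
    return "\n".join(lines)


# Illustrative relations only; gene symbols and slot name are assumptions.
relations = [
    ("TLR4", "signals_to", "MYD88"),
    ("MYD88", "signals_to", "IRAK1"),
]

dot_text = relations_to_dot(relations)
print(dot_text)
```

The resulting text can be saved to a file and rendered with the GraphViz dot tool.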
Querying: The annotation concepts are structured hierarchically, which allows the system to generalise over concepts when they are used as search attributes.
The system enables lab-bench scientists to query for implicit knowledge inferred from the ontological domain model they have built.
For example, queries for "G_Protein_Coupled_Receptor"-annotated ProbeSetIDs would automatically include ProbeSetIDs annotated as "Adrenergic_Receptor", because the subsumption ("an adrenergic receptor is-a G-protein coupled receptor") is inferred through the ontological hierarchy.
Thus, ontological querying allows powerful and complete access even to implicit annotations and knowledge. Queries for gene relations are as easy to state as queries for many gene attributes in parallel. Such comprehensive access to data semantics through ontological querying, together with the ability to store queries in a query library, improves the quality and completeness of query results and enables a faster and more accurate data-reduction process.
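The subsumption example can be sketched in a few lines of Python. This is a simplified illustration, not the query mechanism of Protege; the dictionaries IS_A and ANNOTATIONS, the functions is_subsumed_by and query, and the ProbeSetIDs are hypothetical.

```python
# Hypothetical is-a hierarchy: child concept -> parent concept.
IS_A = {
    "Adrenergic_Receptor": "G_Protein_Coupled_Receptor",
    "G_Protein_Coupled_Receptor": "Receptor",
}

# Hypothetical direct annotations: ProbeSetID -> annotating concept.
ANNOTATIONS = {
    "probe_A": "Adrenergic_Receptor",
    "probe_B": "G_Protein_Coupled_Receptor",
    "probe_C": "Receptor",
}


def is_subsumed_by(concept, ancestor):
    """Return True if concept equals ancestor or lies below it in IS_A."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)  # walk up one level; None at the root
    return False


def query(concept):
    """Return all ProbeSetIDs annotated with concept or any of its subconcepts."""
    return sorted(
        probe for probe, annotated in ANNOTATIONS.items()
        if is_subsumed_by(annotated, concept)
    )


result = query("G_Protein_Coupled_Receptor")
```

Here the query for G_Protein_Coupled_Receptor returns probe_A as well as probe_B, although probe_A was only annotated as Adrenergic_Receptor; probe_C is excluded because Receptor is more general than the queried concept.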
Website integration: GandrKB integrates access to the websites of GeneSymbols, GeneCards, LocusLink, KEGG, ProwCDs and Biocarta Pathway maps in a ProbeSetID-centered perspective. Current Netaffx annotations for the ProbeSetIDs are built in, as is an exemplary microarray experiment.
Tutorials and further descriptions of the system
Here are two very good introductions to the use and creation of ontologies in general: ProtegeKnowledgebases (English) and Wissensbasen (German). I regard these as essential for understanding the Gandr system. For a brief overview of the system, look at the poster, or read a summary about GandrKB and how it was created.
To see what kind of annotation concepts are provided within the Gandr ontology, look at the "Top-level-concepts"
picture. To see how these concepts are linked, look at the UML-Class-Diagram or the
Top-level-schema generated with OntoViz.
For a practical introduction to the diverse capabilities of the GandrKB system, download and view the tutorial films or the Short practical User Guide for GandrKB.
To get an idea of the visualisations possible with GandrKB and Protege-2000, look at some screenshots.
To actively browse through the annotation concepts Gandr consists of, look at the GandrOntology in
HTML Format.
Further information on the Gandr System, other bioontologies and ontologies in general can be found on my personal website.
Installation
For client-based use of the GandrKB with powerful query and visualisation techniques, download and install Protege-2000 Version 2.1.2, unzip the QueryExportTab plugin into the /Protege_2.1/plugins directory, download GandrKB and open the GandrKB.pprj file. If you open the GandrKB without first installing the QueryExportTab plugin, you will not be able to explore the example query library that comes with GandrKB. If you would like to use the OntoViz tab, you will have to install AT&T's free GraphViz Dot tool first.
OntoViz, TouchGraph and other plugins then have to be selected in the Protege/Configure menu after the GandrKB has been opened.
A machine with more than 512 MB of RAM, a 2.8 GHz Pentium IV processor and a fast internet connection is recommended. Loading the whole project initially takes about two minutes.
Gandr for other Affymetrix Chips
You can use the Gandr ontology to annotate other mammalian Affymetrix microarrays. Install Protege as described above and download the Gandr version for the Human HG-U133 Plus 2, the Mouse 430A 2 or the Rat 230 2.
The integrated Netaffx annotation for these chips dates from 10 November 2004. Keep in mind that these versions are meant for your own annotations: the ProbeSetIDs are neither linked to each other nor annotated with ontological concepts, so they are not yet full knowledgebases like the GandrKB for the HU95Av2.
WebGandr (online version of the GandrKB)
Alternatively, you can browse and query the GandrKB as a web application based on webprotege. Keep in mind that you have to use the client-based GandrKB version to exploit the full advantages of the system. Adding your own annotations, nested and complex querying, and automatic graph-based visualizations are not possible with the web version.
Gandr ontology in other formats
The Gandr ontology is stored in a defined standard format called CLIPS. It is possible to map knowledge-representation idioms between different knowledge representation languages. We converted the Gandr ontology into XMI format with the XMI plugin. It can now be used within standard CASE tools such as Poseidon to create UML diagrams that describe one's knowledge models in a uniform and standardized way. The GandrKB UML class diagram, for example, was generated this way.
You can download the XMI or the ZUML format of the Gandr ontology here for experimentation (right-click and save link target as).
Protege plugin development
In the context of this work I also implemented the QueryExportTab, a plugin for the Protege-2000 KB editor. It is a slightly modified version of the traditional QueriesTab, to which I added an export function that lets users export the results of their queries as a tab-delimited text file.
This plugin is useful for the current stable version and for Protege versions before 3.0 Beta (build 40). Later versions integrate this code in their standard distribution.
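The kind of output produced by such an export can be sketched in Python as follows. This is not the plugin's code; the function export_results, the column names and the rows are hypothetical and serve only to show what a tab-delimited result file looks like.

```python
import csv
import io


def export_results(rows, fieldnames, out):
    """Write query result rows to out as tab-delimited text with a header line."""
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Illustrative query results; ProbeSetIDs and column names are assumptions.
rows = [
    {"ProbeSetID": "probe_A", "Concept": "Adrenergic_Receptor"},
    {"ProbeSetID": "probe_B", "Concept": "G_Protein_Coupled_Receptor"},
]

buffer = io.StringIO()  # an opened text file would work the same way
export_results(rows, ["ProbeSetID", "Concept"], buffer)
text = buffer.getvalue()
```

A file written this way can be opened directly in a spreadsheet program for further processing.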
Last updated February 2005.

E-Mail: schober@mdc-berlin.de