Summary of MGED Ontology Build Workshop (Penn Ontology Workshop 2)

 

From January 13 to 15, 2003, a workshop was held at the University of Pennsylvania to build the MGED Ontology. Attending were Cathy Ball (Stanford), Helen Parkinson (EBI), Paul Spellman (UC Berkeley), Chris Stoeckert (Penn), Trish Whetzel (Penn), and Joe White (TIGR). This was a follow-up to the work started in early December 2002 by Helen Parkinson, Chris Stoeckert, and Trish Whetzel to transform the MGED Ontology into a MAGE-supportive MGED Core Ontology. The aims of the workshop (which were largely achieved) were to:

 

1. Familiarize everyone with the current ontology and the way it has been built.

2. Make a pass through the current ontology and agree on the classes and properties.

3. Make significant progress in the addition of new instances.

 

Core Ontology approach:

1. Organize ontology according to MAGE packages

2. Class names have been changed to facilitate mapping and properties adjusted to avoid conflicts.

3. Core ontology does not cover all of MAGE or contain all MAGE associations. Scope of ontology is supporting MAGE ontology entry associations, adding constraints within MAGE structure, and adding further structure and depth to MAGE classes without changing base classes.

4. Definitions are derived from MAGE OMG document Gene Expression Specification where appropriate and available.

5. Enumerated lists are represented as instances of a class where these lists are open ended.

6. Classes not found in MAGE were moved to the extended ontology except in cases where subclasses are used to organize instances.

7. When instances for a single class became numerous, subclasses to conceptually organize the instances were used.

8. Associations not found in MAGE classes were removed.

9. When a property is simply an association to a controlled term list, "has_type" is used, except in special MAGE cases such as has_biosample_type.

10. When an ontology is needed but there is no direct association, describable is used (if the class is describable), e.g., transformation.

11. For defining terms, include synonyms but indicate whether exact or non-exact.

12. Term identifiers will come from the DAML resource, i.e., http://mged.sourceforge.net/ontologies/MGEDCoreOntology.daml#term
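The identifier convention in item 12 can be sketched in a few lines of Python. This is only an illustration: the base URL is the one given in the notes, while the function names and the validation rules (no whitespace, no "#" in a term name) are assumptions made here, not something the workshop specified.

```python
from typing import Tuple

# Base URL as given in the workshop notes; the helpers below are hypothetical.
MGED_CORE_BASE = "http://mged.sourceforge.net/ontologies/MGEDCoreOntology.daml"


def term_identifier(term: str, base: str = MGED_CORE_BASE) -> str:
    """Build an identifier of the form <base>#<term>."""
    # Assumed rule: a term name is non-empty and has no whitespace or '#'.
    if not term or "#" in term or any(ch.isspace() for ch in term):
        raise ValueError(f"not a usable term name: {term!r}")
    return f"{base}#{term}"


def split_identifier(identifier: str) -> Tuple[str, str]:
    """Split <base>#<term> back into (base, term)."""
    base, sep, term = identifier.partition("#")
    if not sep or not term:
        raise ValueError(f"no '#term' fragment in {identifier!r}")
    return base, term
```

For example, term_identifier("MaterialType") yields the base URL followed by "#MaterialType", and split_identifier reverses that.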

 

Discussions of specific classes:

1. BioAssayDataCluster - does this refer only to data transformed by a clustering algorithm, or is it for any higher level analysis (differential expression, ANOVA, etc.)? It was decided that it could be applied to any higher level analysis. Node represents any group of objects with ontology entries for node type, scale and value.

2. We will further structure replicates, normalization, and quality control in the extended ontology, and they will have descriptions, so the MAGE-specified associations are not used for these descriptions.

3. Extended ontology will constrain experimental factors to treatments, biosourcecharacteristics, etc. This will constrain terms for factor values since factor values are not part of core ontology.

4. Need axiom for experimental factor category - disjoint.

5. Authority for Units - NIST: UNC dictionary.

6. MaterialType is used to describe what you have after generating a biosample starting with an organism and progressing through smaller parts down to purified material to label.

7. Treatment: no instances. Distinguish atomic action from complex action; both end up in MAGE action. The instances will be the same as for protocol type. This may just propagate in forms but cannot be specified in the ontology, as that would be adding an association. Treatment subclasses were removed (note: the associated properties still need to be removed). Growth condition subclasses are now just captured as protocol parameters (these would need to be put in the extended ontology if desired).

8. Protocol subclasses moved to instances of protocol type. Create ProtocolParameterType in the extended ontology.

9. Technology type and array manufacture protocol type can share instances.

10. Role: according to MAGE, just a tag for contacts; <-ContactType

11. Need BioAssayDataType in extended ontology

12. GeneticMaterial -> moved to extended ontology since it is only referred to by genetic modification (in ext. ontology)

13. Experiment design type subclasses are: biomolecular annotation, methodological, perturbational, biological properties, epidemiological. Note that these can be used in combination and that they are used for big-picture classifications.

 

MAGE Issues:

Two issues that might be resolved by changes in MAGE were discussed by conference call with Michael Miller (Rosetta). Both issues concerned changes to cardinality constraints. The first was a request to change the association between Experiment and ExperimentDesign from [1, 1] to [1, n]. This change would allow a single experiment to represent multiple experimental designs, as often occurs in a published study. The other request was to make the association from BioSample to BioSampleType nullable, i.e., to change it from [1, 1] to [0, 1]. BioSampleType is only used to specify which BioSample represents the final state or extract prior to a LabeledExtract. Michael will bring these changes to the OMG committee on revisions for inclusion in MAGE v1.1. These will be adopted unless the changes would cause code written to the v1.0 specifications to break.
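The two requested cardinality changes can be illustrated with a small Python sketch. The Cardinality class and its method names are hypothetical and are not part of MAGE. The check only shows that every count valid under the v1.0 constraint stays valid under the proposed one; it says nothing about whether code written to v1.0 would break.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """A [lower, upper] bound on an association; upper=None stands for n (unbounded)."""
    lower: int
    upper: Optional[int]

    def allows(self, count: int) -> bool:
        """True if an object may have `count` associated objects."""
        if count < self.lower:
            return False
        return self.upper is None or count <= self.upper

    def includes(self, old: "Cardinality") -> bool:
        """True if every count allowed by `old` is also allowed by this constraint."""
        lower_ok = self.lower <= old.lower
        upper_ok = self.upper is None or (
            old.upper is not None and old.upper <= self.upper
        )
        return lower_ok and upper_ok


# Requested changes from the notes, written as (v1.0, proposed).
REQUESTED = {
    "Experiment -> ExperimentDesign": (Cardinality(1, 1), Cardinality(1, None)),
    "BioSample -> BioSampleType": (Cardinality(1, 1), Cardinality(0, 1)),
}

for name, (old, new) in REQUESTED.items():
    # Both proposals only loosen the constraint, so this prints True twice.
    print(name, new.includes(old))
```

Under the proposed constraints, an Experiment with three ExperimentDesigns and a BioSample with no BioSampleType both become valid, while anything valid under v1.0 remains so.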

 

Classes needing instances:

Experiment Factor categories

protocol types

actions

class missing - defect type (feature, zone)

class missing - fiducial types

Controls

fail types

warning types

image file format

scale

class missing - data type

hardware type

software type

 

Usage of Experiment factor categories:

use case 1: glucose concentration changes over time

            Design type = compound treatment, time series

            Factor category = glucose, time

            Measurement = [concentration], [time point]  for each BioAssay

use case 2: comparing tumor types

            design type = cell type comparison, disease state, individual variation

            factor category = tumor type A, tumor type B, cell line 1, cell line 2

            measurement = yes or no for each bioassay
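The two use cases above can be written down as simple records, sketched here in Python. The FactorUsage class, the record method, and the bioassay names and measurement values are all made up for illustration; only the design types and factor categories come from the notes. The rule that each bioassay carries exactly one measurement per factor category is an assumption drawn from the "for each BioAssay" wording.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FactorUsage:
    """Design types and factor categories for one experiment (hypothetical structure)."""
    design_types: List[str]
    factor_categories: List[str]

    def record(self, bioassay: str, measurements: Dict[str, str]) -> Dict[str, str]:
        """Return one row for a bioassay, requiring one value per factor category."""
        expected = set(self.factor_categories)
        given = set(measurements)
        if expected != given:
            raise ValueError(
                f"missing {sorted(expected - given)}, unexpected {sorted(given - expected)}"
            )
        return {"bioassay": bioassay, **measurements}


# Use case 1: glucose concentration changes over time.
glucose_series = FactorUsage(
    design_types=["compound treatment", "time series"],
    factor_categories=["glucose", "time"],
)

# Use case 2: comparing tumor types.
tumor_comparison = FactorUsage(
    design_types=["cell type comparison", "disease state", "individual variation"],
    factor_categories=["tumor type A", "tumor type B", "cell line 1", "cell line 2"],
)

# Measurement values below are invented placeholders, not data from the workshop.
row1 = glucose_series.record("bioassay_1", {"glucose": "2 g/L", "time": "30 min"})
row2 = tumor_comparison.record(
    "bioassay_2",
    {"tumor type A": "yes", "tumor type B": "no", "cell line 1": "yes", "cell line 2": "no"},
)
```

In use case 1 each measurement is a value for the category (a concentration, a time point), whereas in use case 2 each measurement is a yes or no against the category.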

 

Test papers to illustrate usage of instances for ExperimentDesignType, FactorCategory, Measurement.

Joe - genome annotation

Helen - perturbational

Cathy - biological property

Chris/Trish - methodological