Summary of MGED Ontology Build Workshop (Penn Ontology Workshop 2)
From Jan 13 - 15, 2003, a workshop was held at the University of Pennsylvania
to help build the MGED Ontology. Attending were Cathy Ball (Stanford), Helen
Parkinson (EBI), Paul Spellman (UC Berkeley), Chris Stoeckert (Penn), Trish
Whetzel (Penn), and Joe White (TIGR). This was a follow-up to the work started
in early December 2002 by Helen Parkinson, Chris Stoeckert, and Trish Whetzel
to transform the MGED Ontology into a MAGE-supportive MGED Core Ontology. The
aims of the workshop (which were largely achieved) were to:
1. Get everyone to be familiar with the current ontology and the way it has
   been built.
2. Make a pass through the current ontology and agree on the classes and
   properties.
3. Make significant progress in the addition of new instances.
Core Ontology approach:
1. Organize the ontology according to MAGE packages.
2. Class names have been changed to facilitate mapping, and properties adjusted
   to avoid conflicts.
3. The core ontology does not cover all of MAGE or contain all MAGE
   associations. The scope of the ontology is supporting MAGE ontology entry
   associations, adding constraints within the MAGE structure, and adding
   further structure and depth to MAGE classes without changing base classes.
4. Definitions are derived from the MAGE OMG document Gene Expression
   Specification where appropriate and available.
5. Enumerated lists are represented as instances of a class where these lists
   are open ended.
6. Classes not found in MAGE were moved to the extended ontology, except in
   cases where subclasses are used to organize instances.
7. When instances for a single class became numerous, subclasses were used to
   organize the instances conceptually.
8. Associations not found in MAGE classes were removed.
9. When an association simply points to a controlled term list, the property
   "has_type" is used, except in special MAGE cases such as
   has_biosample_type.
10. When an ontology entry is needed but there is no direct association,
    describable is used (if the class is describable), e.g., transformation.
11. For defining terms, include synonyms but indicate whether each is exact or
    non-exact.
12. Term identifiers will come from the DAML resource, i.e.,
    http://mged.sourceforge.net/ontologies/MGEDCoreOntology.daml#term
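The identifier and synonym conventions in items 11 and 12 can be sketched in
Python. This is only an illustration under stated assumptions, not part of the
workshop output: the Term and Synonym classes and the example synonym are
hypothetical, and only the base URL is taken from the notes above.

```python
from dataclasses import dataclass, field

# Base resource from item 12; everything else here is a hypothetical illustration.
CORE_ONTOLOGY_BASE = "http://mged.sourceforge.net/ontologies/MGEDCoreOntology.daml"


@dataclass
class Synonym:
    text: str
    exact: bool  # item 11: record whether the synonym is exact or non-exact


@dataclass
class Term:
    name: str
    definition: str = ""
    synonyms: list = field(default_factory=list)

    @property
    def identifier(self) -> str:
        # The identifier is the DAML resource plus a fragment naming the term.
        return f"{CORE_ONTOLOGY_BASE}#{self.name}"


# Hypothetical example; the synonym text is invented for illustration only.
term = Term(
    name="MaterialType",
    synonyms=[Synonym("material kind", exact=False)],
)
print(term.identifier)
```

Keeping the exact/non-exact flag on each synonym, rather than in two separate
lists, is one possible design; the notes do not prescribe a representation.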
Discussions of specific classes:
1. BioAssayDataCluster - does this refer only to data transformed by a
   clustering algorithm, or is it for any higher-level analysis (differential
   expression, ANOVA, etc.)? It was decided that it could be applied to any
   higher-level analysis. Node represents any group of objects, with ontology
   entries for node type, scale and value.
2. We will further structure replicates, normalization, and quality control in
   the extended ontology, and they will have descriptions, so the
   MAGE-specified associations are not used for these descriptions.
3. Extended ontology will constrain experimental factors to treatments,
   biosourcecharacteristics, etc. This will constrain terms for factor values,
   since factor values are not part of the core ontology.
4. Need axiom for experimental factor category - disjoint.
5. Authority for Units - NIST: UNC dictionary.
6. MaterialType is used to describe what you have after generating a biosample,
   starting with an organism and progressing through smaller parts down to
   purified material to label.
7. Treatment: no instances. Distinguish atomic action from complex action; both
   end up in MAGE action. Will be the same instances as protocol type. May just
   propagate in forms, but can't specify in ontology as it would be adding an
   association. Treatment subclasses removed (note: need to remove associated
   properties). Growth condition subclasses are now just captured as protocol
   parameters <would need to put in extended ontology if desired>.
8. Protocol subclasses moved to instances of protocol type. Create
   ProtocolParameterType in ExtendedOntology.
9. Technology type and array manufacture protocol type can share instances.
10. Role: according to MAGE, just a tag for contacts; <-ContactType
11. Need BioAssayDataType in extended ontology.
12. GeneticMaterial -> moved to extended ontology since it is only referred to
    by genetic modification (in ext. ontology).
13. Experiment design type subclasses are: biomolecular annotation,
    methodological, perturbational, biological properties, epidemiological.
    Note that these can be used in combination, and that they are used for
    big-picture classifications.
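The point in item 13 that design types can be combined can be sketched as a
flag enumeration. This is an illustrative assumption about representation only;
the enum name and the example combination are hypothetical, and the five member
names are the subclasses listed in the notes.

```python
from enum import Flag, auto


class ExperimentDesignType(Flag):
    # The five subclasses named in item 13.
    BIOMOLECULAR_ANNOTATION = auto()
    METHODOLOGICAL = auto()
    PERTURBATIONAL = auto()
    BIOLOGICAL_PROPERTIES = auto()
    EPIDEMIOLOGICAL = auto()


# Hypothetical study classified under two design types at once.
study = ExperimentDesignType.PERTURBATIONAL | ExperimentDesignType.METHODOLOGICAL

print(ExperimentDesignType.PERTURBATIONAL in study)
print(ExperimentDesignType.EPIDEMIOLOGICAL in study)
```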
MAGE Issues:
Two issues that might be resolved by changes in MAGE were discussed by
conference call with Michael Miller (Rosetta). Both issues concerned
cardinality constraint changes. The first was a request to change the
association between Experiment and ExperimentDesign from [1, 1] to [1, n]. This
change would allow a single experiment to represent multiple experimental
designs, as often occurs in a published study. The other request was to make
the association from BioSample to BioSampleType nullable, i.e., from [1, 1] to
[0, 1]. BioSampleType is only used to specify which BioSample represents the
final state or extract prior to a LabeledExtract. Michael will bring these
changes to the OMG committee on revisions for inclusion in MAGE v1.1. These
will be adopted unless the changes would cause code written to the v1.0
specification to break.
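The two cardinality requests can be illustrated with a small Python sketch.
This is not MAGE code: the Cardinality class, its method names, and the
constant names are hypothetical, and only the [lower, upper] bounds are taken
from the discussion above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    lower: int
    upper: Optional[int]  # None stands for "n" (no upper bound)

    def allows(self, count: int) -> bool:
        """Return True if `count` associated objects satisfy the constraint."""
        if count < self.lower:
            return False
        return self.upper is None or count <= self.upper

    def widens(self, old: "Cardinality") -> bool:
        """Return True if every count allowed by `old` is also allowed here."""
        lower_ok = self.lower <= old.lower
        upper_ok = self.upper is None or (
            old.upper is not None and old.upper <= self.upper
        )
        return lower_ok and upper_ok


# Bounds as stated in the notes; the constant names are illustrative.
EXPERIMENT_TO_DESIGN_V1_0 = Cardinality(1, 1)
EXPERIMENT_TO_DESIGN_REQUESTED = Cardinality(1, None)  # [1, n]
BIOSAMPLE_TO_TYPE_V1_0 = Cardinality(1, 1)
BIOSAMPLE_TO_TYPE_REQUESTED = Cardinality(0, 1)        # nullable

# An experiment with two designs is rejected under [1, 1] but fits [1, n].
print(EXPERIMENT_TO_DESIGN_V1_0.allows(2))
print(EXPERIMENT_TO_DESIGN_REQUESTED.allows(2))

# A BioSample without a BioSampleType fits only the requested [0, 1].
print(BIOSAMPLE_TO_TYPE_V1_0.allows(0))
print(BIOSAMPLE_TO_TYPE_REQUESTED.allows(0))
```

Both requested bounds widen the v1.0 bounds, so data valid under v1.0 would
remain valid. Code that assumes exactly one associated object could still need
changes, which is the compatibility question left to the OMG committee.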
Classes needing instances:
Experiment Factor categories
protocol types
actions
class missing - defect type (feature, zone)
class missing - fiducial types
Controls
fail types
warning types
image file format
scale
class missing - data type
hardware type
software type
Usage of Experiment factor categories:
Use case 1: glucose concentration changes over time
  Design type = compound treatment, time series
  Factor category = glucose, time
  Measurement = [concentration], [time point] for each BioAssay
Use case 2: comparing tumor types
  Design type = cell type comparison, disease state, individual variation
  Factor category = tumor type A, tumor type B, cell line 1, cell line 2
  Measurement = yes or no for each BioAssay
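The two use cases can be written down as simple records to show how design
types, factor categories, and per-BioAssay measurements relate. This is a
sketch only: the FactorUsage class, the bioassay_record helper, and the sample
values passed to it are hypothetical, while the design types and factor
categories are copied from the use cases above.

```python
from dataclasses import dataclass


@dataclass
class FactorUsage:
    design_types: list
    factor_categories: list
    measurements: list  # what is recorded for each BioAssay


use_case_1 = FactorUsage(
    design_types=["compound treatment", "time series"],
    factor_categories=["glucose", "time"],
    measurements=["concentration", "time point"],
)

use_case_2 = FactorUsage(
    design_types=["cell type comparison", "disease state", "individual variation"],
    factor_categories=["tumor type A", "tumor type B", "cell line 1", "cell line 2"],
    measurements=["yes or no"],
)


def bioassay_record(usage: FactorUsage, values: list) -> dict:
    """Pair each factor category with the value recorded for one BioAssay."""
    if len(values) != len(usage.factor_categories):
        raise ValueError("one value is required per factor category")
    return dict(zip(usage.factor_categories, values))


# Invented values for a single BioAssay, for illustration only.
print(bioassay_record(use_case_1, ["5 mM", "30 min"]))
print(bioassay_record(use_case_2, [True, False, True, False]))
```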
Test papers to illustrate usage of instances for ExperimentDesignType,
FactorCategory, and Measurement:
Joe - genome annotation
Helen - perturbational
Cathy - biological property
Chris/Trish - methodological