Summary of Penn Ontology Workshop (POW) 3

3-5 March 2003, U.Penn.

 

Present: Chris Stoeckert, Trish Whetzel, Helen Parkinson, John Matese, Paul Spellman, Joe White.

 

Aims of POW3:

Completion of the Core Ontology

Adding instances

Identifying contact domains for annotation.

Full annotation examples, including BioMaterial.

Education in area of data transformation to get new instances.

 

Future POW 4

Possible date May 10, location TBD.

Invite special interests to join.

POW5? possibly in July, to focus on software which implements the ontology.

Completed annotation examples for discussion.

Interfacing with external ontologies.

Including disjoints and other constraints.

 

MGED6

Tutorials are required: one for MAGE coders and one for biologists. Getting the ontology validated by MAGE developers was discussed; Angel, Ugis, Mahendra, NCICB, Genex, Jax, Jason Goncalves @ Iobion, and Rosetta are all possible candidates. Time frames were either pre or post MGED 6.

 

Notes on POW3

------------

 

BioSample and BioSource defs changed to make the differences explicit.

 

Instances added to classes:

HardwareType (not a long list, policy not to include supplies, just major hardware).

FailType

WarningType

ControlType

AtomicAction* cross check with protocol to ensure completeness

ComplexAction

Instances from both the Action subclasses were examined to ensure that they were correctly classified and some were moved after discussion. We also agreed that we should create as few terms as possible and that these should not overlap if possible.

Paul suggested a change_environment instance. Because a biomaterial characteristic should have changed whenever this instance is used, it was renamed change_biomaterial_characteristic to reflect this.

ProtocolType

ImageFormat

FeatureShape

SoftwareType

 

 

MAGE concepts addressed:

 

Transformation is in the extended ontology. It is not part of the core because it does not need an OntologyEntry, and it should be addressed after the core ontology is finished.

 

DatabaseQualityControl: intended to be descriptive text in MAGE, but allowed to have an OntologyEntry. It was created as such within the ontology, as it was decided that this information should be described by an OntologyEntry for query purposes. Three new classes were added (ReplicateDescription, NormalizationDescription, QualityControlDescription), plus named type classes to hold the OntologyEntry information. Instances have not yet been added to these classes, but possible values of QualityControlDescriptionType could be dye swap, peer review, and array type quality control. NormalizationDescriptionType instances will require working with the DataProcessing working group.

 

ExperimentalFactor:

 

Examples:

ExperimentalFactor

Problems with grouping these were addressed. 1. It is not possible in MAGE to group Factors easily by BioAssay and channel:

 

10 mM NaCl, 10% glucose

150 mM NaCl, 1% glucose

 

All factors can be expressed, but only singly, and the current MAGE model has an (unenforceable) rule that there should be only one factor value per TopLevelBioAssay. This means that there is no way to express which combination of factors was applied, or which channel the factors belong to. This is a serious problem for two-channel experiments.

 

Paul suggested that this is a serious problem with the model and should be resolved within the Ontology if possible. The POW took the decision to create two new classes which describe factor value. These will be encoded in MAGE by means of PropertySets attached to FactorValues. The rule of only one factor value per TopLevelBioAssay will be discarded, as it is not enforceable anyway. This will not affect code, and a small test MAGE-ML example was found to be valid. Paul offered to provide a code example (action item Paul) in addition to the small one coded by Helen at the meeting.
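
As a rough illustration of the grouping idea (not the MAGE-ML encoding agreed at the meeting), the sketch below tags each factor value with name/value properties and then recovers which combination of factors was applied to each channel. The property keys "bioassay" and "channel", the identifier "hyb1", and the channel labels "Cy3" and "Cy5" are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FactorValue:
    """A single factor value, e.g. factor "NaCl" with value "10 mM".

    `properties` stands in for a PropertySet of name/value pairs. The keys
    "bioassay" and "channel" are hypothetical, chosen only for this sketch.
    """
    factor: str
    value: str
    properties: dict = field(default_factory=dict)


def group_factor_values(factor_values):
    """Group factor values by their (bioassay, channel) properties."""
    groups = defaultdict(list)
    for fv in factor_values:
        key = (fv.properties.get("bioassay"), fv.properties.get("channel"))
        groups[key].append(f"{fv.value} {fv.factor}")
    return dict(groups)


# Hypothetical two-channel hybridization using the conditions from the notes.
factor_values = [
    FactorValue("NaCl", "10 mM", {"bioassay": "hyb1", "channel": "Cy3"}),
    FactorValue("glucose", "10%", {"bioassay": "hyb1", "channel": "Cy3"}),
    FactorValue("NaCl", "150 mM", {"bioassay": "hyb1", "channel": "Cy5"}),
    FactorValue("glucose", "1%", {"bioassay": "hyb1", "channel": "Cy5"}),
]

for (bioassay, channel), combo in group_factor_values(factor_values).items():
    print(bioassay, channel, ", ".join(combo))
```

Without the shared properties, the four factor values would be a flat list with no record of which pair went with which channel.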

 

ExperimentalFactorCategories

In some of the examples of the experiment package coded by various members of the group there was a problem describing certain factors. The problem is limited to those factor values that are a string.

E.g., if an experiment is a strain comparison, the strain names do not belong in FactorCategory; what belongs there is a descriptor indicating that this is a strain comparison.

 

We therefore decided to make a new class to hold these top-level terms, and the classes within BioMaterialCharacteristics were used to create its instances, so that the class name StrainOrLine is also an instance, strain_or_line. The class containing these instances is:

BioMaterialCharacteristicCategory; example instances are strain_or_line and organism_part.

It was also noted that we may not have been working from the most recent PNGs. The most recent and extensive documentation, prepared by Cathy Ball, has some PNGs which need to be replaced. The most recent PNGs were distributed and Helen offered to update Cathy's documentation.
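
The class-to-instance naming used for BioMaterialCharacteristicCategory (StrainOrLine giving strain_or_line) could be derived mechanically. The sketch below is one way to do that; the function name is hypothetical, and the rule is inferred only from the two examples given in these notes, so other class names may need manual checking.

```python
import re


def class_name_to_instance_name(class_name: str) -> str:
    """Turn a CamelCase class name into a lower-case, underscore-separated name.

    An underscore is inserted wherever a lower-case letter or digit is
    followed by an upper-case letter, then the result is lower-cased.
    """
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", class_name).lower()


for name in ("StrainOrLine", "OrganismPart"):
    print(name, "->", class_name_to_instance_name(name))
```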

A second task is that the Ontology was constructed on the basis of the previous documentation and now needs to be checked to ensure all the OntologyEntries are represented. (*action item Helen).

 

Databases and external Ontology entries.

In the revised MAGE PNGs there is an association from OntologyEntry via OntologyReference to DatabaseEntry, which in turn has an association to Database. Helen proposed that these should be within the ontology, along with URIs and descriptions, so that both databases (like ArrayExpress) and users of the Ontology can clearly identify databases. This information will be both in the Ontology and declared in the DescriptionPackage of MAGE, and the two will therefore be consistent.
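
The association chain described above can be pictured with a few plain data classes. This is only a sketch: the attribute names (category, value, accession, name, uri, description) and the example database are assumptions for illustration, not taken from the MAGE PNGs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Database:
    """A source database or ontology, identified by name and URI."""
    name: str
    uri: str
    description: str = ""


@dataclass(frozen=True)
class DatabaseEntry:
    """An accession within a Database."""
    accession: str
    database: Database


@dataclass(frozen=True)
class OntologyEntry:
    """A term; ontology_reference points at the DatabaseEntry it came from."""
    category: str
    value: str
    ontology_reference: Optional[DatabaseEntry] = None


def source_of(entry: OntologyEntry) -> Optional[Tuple[str, str]]:
    """Follow OntologyEntry -> DatabaseEntry -> Database; None if unreferenced."""
    ref = entry.ontology_reference
    return None if ref is None else (ref.database.name, ref.database.uri)


# Hypothetical database and accession, used only to exercise the chain.
example_db = Database("ExampleOntologyDB", "http://example.org/ontology",
                      "Placeholder external ontology")
term = OntologyEntry("OrganismPart", "leaf", DatabaseEntry("EX:0001", example_db))
print(source_of(term))
```

Keeping the name, URI, and description on a single Database record is what lets the same identification be reused by both the ontology and a MAGE document.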

 

 

Meeting with Gilberto Fragoso and LiJu Fan from NCICB.

 

The various tools and ontologies from NCICB were introduced. Some of these will be very useful, e.g., compounds, drugs, cancer anatomy terms. Some NCICB terms are taken from other known ontologies, e.g., mouse terms from the Jackson Labs. The ontologies are not in a wholly usable form, as they may not have visible unique IDs which are easy for a user to point at and include in their annotation. NCICB are interested in doing some work to make these IDs available.

 

The gene expression database at NCI already uses some of the MIAME concepts. LiJu brought a mapping of the MIAME concepts to NCI's own terms. These need to be modified to reflect changes in MAGE, and these changes were discussed. LiJu now has the MAGE-MIAME mapping document.

 

Action Items:

 

Code Factors grouped via PropertySets and applied multiply to TopLevelBioAssays, Paul

Coding of a Cancer data set to show to Gilberto and NCICB, Helen

 

Check through Ontology for:

typos, Trish

missing associations especially of instances, Trish

instances, properties, classes all defined and all in use; report on those which are not, Helen

Compile a list of databases and OntologyEntries, update the website with these, and add appropriate descriptions that point users to multiple vocabularies where they exist. E.g., TAIR has plant-specific GO terms, OrganismPart and DevelopmentalStage terms. John/Helen

 

Trish has raised additional items:

checking has_type association to ensure it is used consistently, Trish

modification of definition of "cell" for MaterialType to be stated as "one or more cells, excluding single-cell organisms".