microarray-ontol-digest    Wednesday, January 16 2002    Volume 01 : Number 018

----------------------------------------------------------------------

Date: Mon, 31 Dec 2001 11:06:49 -0800
From: "Miller, Michael"
Subject: RE: [microarray-ontol] Expressive power of Ontologies

Chris,

Thanks for this reply; it also answers the question I asked earlier about
whether values can or should be part of an Ontology. The answer is clearly
'yes', if I've finally got this right, but in a way that lets two people
using the same Ontology definitively answer questions about equality,
whereas two people using two different ontologies (for instance, for 'sex
gender mating type') can't settle questions of equality without a
bridge/mapping between the two ontologies (which may still be an incomplete
mapping if one ontology has distinctions that the other one doesn't).

So, though I don't think the MAGE standard can dictate what Ontologies can
be used, we will almost certainly need to modify it as the use cases for
Ontologies become clearer. And, although I don't think MAGE can dictate what
Ontologies to use, repositories that accept data in MAGE format certainly
are free to require use of specific Ontologies.

Regards,
Michael

> -----Original Message-----
> From: Chris Stoeckert [mailto:stoeckrt@SNOWBALL.pcbi.upenn.edu]
> Sent: Monday, December 31, 2001 9:50 AM
> To: jason@openinformatics.com
> Cc: microarray-ontol@ebi.ac.uk
> Subject: Re: [microarray-ontol] Re: sex gender mating type
>
> Hey Jason,
> I'm delighted that you MAGE guys are joining our discussions! You are
> right that this is a fundamental issue and one that should be driven by
> "use cases." The issue that I'm referring to is the extent and purpose
> of the MGED ontology. Should it simply be a collection of pointers to
> controlled vocabulary terms or IDs as you proposed in your mail? My
> answer is "No."
> This was the initial goal and the purpose of the Ontology Resources at
> the MGED OWG home page
> (http://www.cbil.upenn.edu/Ontology/index.html#ontology.resources). This
> does address the use case of "assigning terms to data, retrieving
> similar data and not dis-similar ones" but not completely as I'll
> discuss later.
>
> Ontologies can be more than (hierarchical) controlled vocabulary terms
> with definitions. They can describe objects (subclassing AND foreign
> keys). They can provide constraints as to how those objects can be built
> so that a computer can use an ontology to reason whether an object makes
> sense (semantic checking) and infer parts of the object that haven't
> been explicitly added.
>
> There are two very pressing use cases that require this reasoning
> capability. The first is "Is this a MIAME-compliant document?" Because
> many fields describing the sample/biomaterial in a microarray experiment
> are context dependent, checking for null values is not very useful. An
> ontology could be used to determine if the fields (and their values) are
> complete and make logical sense. The second use case is minimizing
> effort to enter annotation. Rather than go through page after page of
> forms that are designed for general input, an ontology could be used to
> determine what are appropriate fields given some initial input and fill
> some of the remaining fields in. The current MGED ontology does not yet
> have this expressive power but can with addition of constraints.
>
> Getting back to the use case of "retrieving similar data and not
> dis-similar ones," the resolution of what is similar will be determined
> by the expressive power of the ontology. In the simplest case, a list of
> properties, their values and sources can be compared. While valuable
> (even if you just get the same species), this approach is limited to
> properties that can be described by a single value.
> Complex properties (objects with attributes) such as age, biosource
> provider, environmental history (e.g., culture condition), and treatment
> (e.g., somatic modification) cannot each be captured with a single
> controlled vocabulary term or external ID.
>
> Like MAGE, the MGED ontology will be best understood when software is
> written to make use of it. Right now we need to flesh out the MGED
> ontology a bit more but I believe that Helen Parkinson, Susanna Sansone,
> and others in Alvis Brazma's group are designing their annotation forms
> for ArrayExpress using this. We will be revising our forms for RAD and
> building this into our MAGE exporter for RAD. I would love to hear what
> others are doing.
>
> Chris

------------------------------

Date: 31 Dec 2001 13:14:33 -0700
From: jason@openinformatics.com (Jason E. Stewart)
Subject: Re: [microarray-ontol] Expressive power of Ontologies

"Miller, Michael" writes:

> So, though I don't think the MAGE standard can dictate what
> Ontologies can be used, we will almost certainly need to modify it
> as the use cases for Ontologies become clearer. And, although I
> don't think MAGE can dictate what Ontologies to use, repositories
> that accept data in MAGE format certainly are free to require use of
> specific Ontologies.

Well put Michael, I completely agree.

jas.

------------------------------

Date: Mon, 31 Dec 2001 15:44:06 -0500
From: Chris Stoeckert
Subject: [microarray-ontol] filtered mail

Dear Group,

As owner of this mailing list I've been getting mail delivery errors that
arise from "advertisement" filters. We've had a thread with a subject header
that included the word sex, and I suspect that is causing trouble. For those
of you who haven't gotten these mails I would point you to the archive at
http://bfx.kribb.re.kr/microarray/date.html. To avoid this problem, I would
ask posters (I've been guilty of this too) to keep these advertisement
filters in mind when composing email subject titles.
Thanks,
Chris

------------------------------

Date: Mon, 31 Dec 2001 16:24:30 -0500
From: Chris Stoeckert
Subject: [microarray-ontol] ontology issues

Happy New Year everyone!

Issues have come up while working on version 1.2 of the ontology that could
use group input. Here are a couple

1. ClinicalHistory: Has the ontology entry, ClinicalInformation, which
originally was a way to point to a medical record somewhere. Is that really
useful? It also has a lab value which is some Measurement. What's missing is
a way to describe what the lab value is, when it was done, and how it
relates to the microarray experiment. I suggest we add a date attribute and
replace ClinicalInformation with LabTest where lab test is an OntologyEntry.
An instance of LabTest is LOINC (http://www.regenstrief.org/loinc/)

> The laboratory portion of the LOINC database contains the usual
> categories of chemistry, hematology, serology, microbiology (including
> parasitology and virology), and toxicology; as well as categories for
> drugs and the cell counts you would find reported on a complete blood
> count or a cerebrospinal fluid cell count. Antibiotic susceptibilities
> are a separate category. The clinical portion of the LOINC database
> includes entries for vital signs, hemodynamics, intake/output, EKG,
> obstetric ultrasound, cardiac echo, urologic imaging, gastroendoscopic
> procedures, pulmonary ventilator management, selected survey
> instruments, and other clinical observations.

We could also add a description attribute or give EnvironmentalHistory this
attribute which ClinicalHistory, BarrierFacility, Bedding, CultureCondition,
etc would inherit as an attribute. I'm inclined to be stingy with attributes
to start with and generalize when necessary.

2. DensityRange. This is a subclass of CultureCondition. The only attribute
is Measurement (inherited from CultureCondition). So what's measured? What
are the units? A real example would be 10,000 to 1,000,000 cells/ml.
We could have the measurement be the minimum concentration measurement with
value = 10,000 and units = cells/ml and either add another measurement
(maximum concentration) or add a range attribute (100).

Thanks,
Chris

Chris Stoeckert, Ph.D.
Research Associate Professor, Dept. of Genetics
Center for Bioinformatics, University of Pennsylvania
418 Guardian Dr., Philadelphia, PA 19104
Ph:215-573-4409 FAX:215-573-3111

------------------------------

Date: Thu, 03 Jan 2002 12:30:18 +0000
From: Susanna Sansone
Subject: Re: [microarray-ontol] Re: sex gender mating type

Hello Chris,
Happy New Year to you too!

Re: "ConjugationType" or "MatingType"
It is F+ (capital).
I think the class "MatingType" is fine for bacteria too (as they are called
mating type or cross type).
A subclass of "MatingType" for bacteria called "ConjugationType" to help
clarify may also be confusing for others, as protozoa exchange entire nuclei
in conjugation, and in fungi conjugation also occurs between hyphae of
opposite mating type.
"Conjugation is the temporary union of 2 single celled organisms or hyphae
with at least one receiving genetic material from the other. In bacteria the
exchange is unidirectional from male to female (Dictionary of genetics, King
& Stansfield)"

Cheers,
Susanna

Chris Stoeckert wrote:
>
> Thanks everyone (especially Helen) for the recent discussion on the
> biomaterial description of sex.
> In trying to come up with a consensus to incorporate into the ontology,
> I am stuck when trying to provide instances of mating type as these are
> species specific. The type "a" is used by S. cerevisiae and N. crassa.
> Are they the same? Are H and h different? Is it F+ or f+? Do we need to
> go the "sensu" route (i.e. in the sense of) as the GO people have? I
> would rather not because the constraints are then placed in the text. We
> could add a subclass to "MatingType" for bacteria called
> "ConjugationType" to help clarify what term to use.
> From: BioTech Life Science Dictionary:
> http://biotech.icmb.utexas.edu/search/dict-search.phtml
> > bacterial conjugation
> > Definition:
> > The process of transferring a certain plasmid of DNA known as the F
> > plasmid (or sex plasmid) from bacteria individuals who have it (known
> > as "males") to bacteria individuals who do not already have it (known
> > as "females") by way of direct contact between the bacteria individuals
> > called a conjugation bridge. Once transfer is completed, the female
> > individual becomes a male individual and both parties have a copy of
> > the F plasmid.
>
> They didn't have a definition for mating type or gender but did give a
> search field for the On-line medical dictionary.
> http://www.graylab.ac.uk/cgi-bin/omd?action=Home&query=
> Note: "Gender is a grammatical distinction and applies to words only.
> Sex is natural distinction and applies to living objects." (R. Morris)
>
> Is the "Sex" class the natural place for mating type information or
> would that be included as "IndividualGeneticCharacteristic" or "Strain"
> as Helen indicated in the beginning of the discussion. The "Sex" class
> could be constrained to organisms producing gametes if one normally
> lists mating type along with allele or strain info.
>
> Cheers,
> Chris
>
> On Wednesday, December 19, 2001, at 11:46 AM, Helen Parkinson wrote:
>
> > Hi,
> >
> > suggested revision of sex, now has two subclasses and new
> > definitions and a few instances,
> >
> > comments or more instances anyone or a better word than
> > gender?
> >
> > cheers
> >
> > helen
> >
> >> --------------
> >> &sex
> >> &gender
> >> &mating type
> >> --------------
> >>
> >> %sex
> >>
> >> term applied to any organism able to undergo sexual
> >> reproduction in order to differentiate the two individuals
> >> or types involved. Sexual reproduction is defined as the
> >> ability to exchange genetic material with the potential of
> >> recombinant progeny.
> >>
> >> %gender
> >> subclass of sex applicable to heterogametic species (i.e.
> >> those in which the sexes produce gametes of markedly
> >> different size)
> >>
> >> instances
> >> ---------
> >> male    - producing small numerous gametes
> >> female  - producing small numbers of large gametes
> >> both    - suggest replacing with: mosaic and hermaphrodite
> >> mixed   - a population of individuals with any or all of the
> >>           above present
> >> unknown
> >>
> >> %mating type
> >> instances divided up into species for ease of use, probably
> >> incomplete, possibly wrong
> >> ---------
> >> Candida albicans
> >> H
> >> H+
> >>
> >> Saccharomyces cerevisiae
> >> a
> >> alpha
> >> a/alpha
> >>
> >> Neurospora crassa
> >> A
> >> a
> >>
> >> Schizosaccharomyces pombe
> >> h+
> >> h-
> >
> > ----------------------------------------------------------------------------------
> > thanks to Midori and Sanger people for help with the
> > definitions and instances etc

--
*******************************
Susanna Assunta Sansone, PhD
Microarray Informatics
EBI - The European Bioinformatics Institute
EMBL Outstation - Hinxton, Wellcome Trust Genome Campus
Cambridge CB10 1SD, UK
email: sansone@ebi.ac.uk
direct: +44 (0)1223 494 691
fax: +44 (0)1223 494 468
http://www.ebi.ac.uk/microarray
*******************************

------------------------------

Date: Thu, 03 Jan 2002 12:57:18 +0000
From: Helen Parkinson
Subject: Re: [microarray-ontol] Re: sex gender mating type

Hi,

I agree with Susanna, if we add special sub classes for bacteria then
logically we could have them for fungi etc, then it becomes as complex as
using species specific instances.

helen

Susanna Sansone wrote:
>
> Hello Chris,
> Happy New Year to you too!
>
> Re: "ConjugationType" or "MatingType"
> It is F+ (capital).
> I think the class "MatingType" is fine for bacteria too (as they are
> called mating type or cross type).
> A subclass of "MatingType" for bacteria called "ConjugationType" to help
> clarify may also be confusing for others, as protozoa exchange entire
> nuclei in conjugation, and in fungi conjugation also occurs between
> hyphae of opposite mating type.
> "Conjugation is the temporary union of 2 single celled organisms or
> hyphae with at least one receiving genetic material from the other. In
> bacteria the exchange is unidirectional from male to female (Dictionary
> of genetics, King & Stansfield)"
>
> Cheers,
> Susanna

------------------------------

Date: Thu, 03 Jan 2002 13:23:56 +0000
From: Susanna Sansone
Subject: Re: [microarray-ontol] ontology issues

Hi Chris,

Re: 1. ClinicalHistory: Has the ontology entry, ClinicalInformation... Is
that really useful?

For the ILSI dataset we may need to incorporate biological data:
- toxicological endpoints (hepatotoxicity, nephrotoxicity and genotoxicity),
  probably containing Clinical Observations for the in vivo treatments;
- Clinical Chemistry results (serum chemistry, hematology);
- Histopathology
...but we still don't know to what extent.

Re: 2.
In Measurement we could also consider OD and probably CFU/ml as units
(micro-organisms such as bacteria and yeasts develop, when plated, into
visible colonies (CFU = colony forming units), which are counted; a colony
may result from a single micro-organism/cell or from a clump).

Cheers,
Susanna

Chris Stoeckert wrote:
>
> Happy New Year everyone!
> Issues have come up while working on version 1.2 of the ontology that
> could use group input. Here are a couple
> 1. ClinicalHistory: Has the ontology entry, ClinicalInformation, which
> originally was a way to point to a medical record somewhere. Is that
> really useful? It also has a lab value which is some Measurement. What's
> missing is a way to describe what the lab value is, when it was done,
> and how it relates to the microarray experiment. I suggest we add a date
> attribute and replace ClinicalInformation with LabTest where lab test is
> an OntologyEntry. An instance of LabTest is LOINC
> (http://www.regenstrief.org/loinc/)
> > The laboratory portion of the LOINC database contains the usual
> > categories of chemistry, hematology, serology, microbiology (including
> > parasitology and virology), and toxicology; as well as categories for
> > drugs and the cell counts you would find reported on a complete blood
> > count or a cerebrospinal fluid cell count. Antibiotic susceptibilities
> > are a separate category. The clinical portion of the LOINC database
> > includes entries for vital signs, hemodynamics, intake/output, EKG,
> > obstetric ultrasound, cardiac echo, urologic imaging, gastroendoscopic
> > procedures, pulmonary ventilator management, selected survey
> > instruments, and other clinical observations.
> We could also add a description attribute or give EnvironmentalHistory
> this attribute which ClinicalHistory, BarrierFacility, Bedding,
> CultureCondition, etc would inherit as an attribute. I'm inclined to be
> stingy with attributes to start with and generalize when necessary.
>
> 2. DensityRange.
> This is a subclass of CultureCondition. The only attribute is
> Measurement (inherited from CultureCondition). So what's measured? What
> are the units? A real example would be 10,000 to 1,000,000 cells/ml. We
> could have the measurement be the minimum concentration measurement with
> value = 10,000 and units = cells/ml and either add another measurement
> (maximum concentration) or add a range attribute (100).
>
> Thanks,
> Chris

------------------------------

Date: Thu, 3 Jan 2002 12:47:50 -0500
From: Chris Stoeckert
Subject: Re: [microarray-ontol] Re: gender mating type

OK, scratch ConjugationType. Do we go with "sensu" to describe
species-specific mating types? Or do we put mating type information into a
different class than Sex?

Chris

On Thursday, January 3, 2002, at 07:57 AM, Helen Parkinson wrote:

> Hi,
>
> I agree with Susanna, if we add special sub classes for
> bacteria then logically we could have them for fungi etc,
> then it becomes as complex as using species specific
> instances.
>
> helen

------------------------------

Date: 03 Jan 2002 11:57:40 -0700
From: jason@openinformatics.com (Jason E. Stewart)
Subject: [microarray-ontol] ANNOUNCE: New Queries Mailing List

Hey,

First, my apologies for the cross-list spam.

This email is to announce a new mailing list:

  mged-queries@lists.sf.net

Its specific charter is to discuss what types of queries should be
supported by MAGE and MAGE-ML. In a broader context it is to help define
what types of queries Gene Expression Databases should support.
I will begin posting my views to this list, and I encourage all interested
parties to subscribe and *participate* by going to:

  http://lists.sourceforge.net/lists/listinfo/mged-queries

(It was just created, so it may take a few hours before it's active.)

Cheers,
jas.

------------------------------

Date: Tue, 08 Jan 2002 11:15:12 +0000
From: Helen Parkinson
Subject: Re: [microarray-ontol] Re: sex gender mating type

Hi Jason,

a lot of good points here and a good overview. Chris has a
MAML-MIAME-ontol mapping page; this needs updating to reflect the changes
from MAML to MAGE. This would address the question of where the
MAGE/ontol/MIAME boundaries lie. I am willing to have a go at this. Would
you like to help with the MAGE parts after I generate a rough draft?

Re: this part of your mail:

-------------------------
> An ontology is a description of knowledge. It's a very broad thing,
> and it can become anything that you want to make of it. I'm not sure
> what you (or others) feel about this. At the very least I see it
> providing three things:
>
> 1) A list of properties that samples can have (taxon, cultivar,
>    strain, organ type, tissue type, cell type, etc)
>
> 2) A list of values that each of those properties can have (for organ
>    type: brain, heart, liver, etc)
>
> 3) Definitions for all of the properties and the values.

All of the above is at least partially true, but the problems lie with the
values (instances) and how they are used. The instances that you have used
as examples are the easy ones; when you get into e.g. cell type, the
relationship between the instances is what is important. The discussion on
mating type and sex has shown that different species use the same instance
in different ways. So a simple list of values will not help: instances need
to be structured, probably in a species-specific way, and the constraints
will be complex.
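
As a rough illustration of the point about species-specific instances, here
is a minimal Python sketch in which a mating type label is only meaningful
together with its species, so "a" in S. cerevisiae and "a" in N. crassa are
different values. The names MatingType, KNOWN_MATING_TYPES, is_known and
same_type are hypothetical and are not part of the MGED ontology; the
instances are a subset of those listed earlier in the thread, which were
described there as probably incomplete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatingType:
    """A mating type label that only has meaning within one species."""
    species: str
    label: str


# Subset of the instances proposed in the thread (not authoritative).
KNOWN_MATING_TYPES = frozenset({
    MatingType("Saccharomyces cerevisiae", "a"),
    MatingType("Saccharomyces cerevisiae", "alpha"),
    MatingType("Saccharomyces cerevisiae", "a/alpha"),
    MatingType("Neurospora crassa", "A"),
    MatingType("Neurospora crassa", "a"),
    MatingType("Schizosaccharomyces pombe", "h+"),
    MatingType("Schizosaccharomyces pombe", "h-"),
})


def is_known(species: str, label: str) -> bool:
    """Check a (species, label) pair; labels are case-sensitive (A vs a)."""
    return MatingType(species, label) in KNOWN_MATING_TYPES


def same_type(x: MatingType, y: MatingType) -> bool:
    """Two annotations match only if both species and label agree."""
    return x == y
```

With this structure the label "a" never compares equal across species, which
is the behaviour a flat list of values cannot give.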
Also the ontology is not invariant: it will change as the ontology develops.
I am thinking that when our data flow increases we may have to work to
ontology releases, as instances are added and terms become redundant or
obsolete, as they do in the GO ontology. It is not clear to me how to handle
this easily; perhaps you have some ideas.

cheers

Helen

"Jason E. Stewart" wrote:
>
> Hey Chris,
>
> I'm sending the reply to the list as I figure this is a generally
> useful discussion.
>
> "Chris Stoeckert" writes:
>
> > >> I'm hoping to use the holidays to give me space to actually think
> > >> about these issues. Angel filled me in on the discussion you had
> > >> today. There are multiple examples beside age where a
> > >> self-referential ontology entry would be needed and I hope to
> > >> provide these to you next week.
>
> I'll put my comments after your examples.
>
> > Some other cases like age:
> > BiosourceProvider has 3 attributes. One "biosource_type" is an
> > enumerated list. The others are has_owner(Person) and
> > has_donor(Organization). Person is a class with attributes:
> > first_name, last_name, mid_initials, and
> > has_organization(Organization). Organization has a name. Both
> > Person and Organization are subclasses of Contact which has
> > address, email, fax, toll_free_phone, phone, and has_URI(URI).
> >
> > EnvironmentalHistory is a class with subclasses such as
> > CultureCondition which in turn has subclasses such as
> > Nutrients. CultureCondition has the attribute
> > has_measurement(Measurement) and Nutrients has the attribute
> > nutrient_component(Compound). Compound is a subclass of
> > OntologyEntry.
> >
> > Treatment is a class with subclasses such as
> > Modification. Modification has a subclass SomaticModification
> > which in turn has the attribute
> > has_part_modified(OrganismPart) where OrganismPart is an
> > OntologyEntry. Note that Treatment has_protocol(Protocol) and
> > Modification has a modification_type.
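
The BiosourceProvider example quoted above can be sketched as nested classes
to show why a single name/value/source triple does not cover it. This is
only an illustrative Python rendering: the class and attribute names follow
the quoted description, but the field types, the defaults and the sample
values ("example_type", "Ada", "Example Lab") are assumptions made for the
sketch.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Contact:
    address: Optional[str] = None
    email: Optional[str] = None
    fax: Optional[str] = None
    toll_free_phone: Optional[str] = None
    phone: Optional[str] = None
    uri: Optional[str] = None  # stands in for has_URI(URI)


@dataclass
class Organization(Contact):
    name: str = ""


@dataclass
class Person(Contact):
    first_name: str = ""
    last_name: str = ""
    mid_initials: str = ""
    organization: Optional[Organization] = None  # has_organization


@dataclass
class BiosourceProvider:
    biosource_type: str                   # enumerated in the ontology
    owner: Optional[Person] = None        # has_owner(Person)
    donor: Optional[Organization] = None  # has_donor(Organization)


# Placeholder values, for illustration only.
provider = BiosourceProvider(
    biosource_type="example_type",
    owner=Person(first_name="Ada", last_name="Example",
                 organization=Organization(name="Example Lab")),
)

# asdict() shows the nesting: the owner is an object with its own
# attributes, one of which is again an object.
nested = asdict(provider)
```

Here nested["owner"]["organization"]["name"] is three levels deep, which is
the kind of structure a flat controlled-vocabulary term cannot express.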
> >
> > There are other examples but you get a sense for the complexity of
> > these terms that a name, value, source can't cover alone. Also, you
> > see the need to refer to other ontologies (e.g., compound) and objects
> > used by MAGE (e.g., Contact, Protocol).
>
> Having read the three examples that you presented it is clear to me
> that before we go much further we should move up a level in our
> discussion. I really think we need to discuss where the boundaries of
> the ontology and the MAGE model lie - where they overlap, and why. I
> believe that along with MIAME, the ontology and MAGE will be important
> contributions of MGED to the community, and I believe they should all
> coexist happily. It feels to me that we developed MAGE without paying
> too much attention (until just recently) to what was going on with the
> ontology project.
>
> I'm getting involved not because I want to prove how clever
> my ideas are, but because I want to see the ontology become something
> that's really useful for people. Two years ago I knew nothing about
> XML. One year ago I knew nothing about UML. Two months ago I knew
> nothing about ontologies. So at best, I'm an opinionated novice.
>
> I'm tossing in all these caveats because I think that there's been a
> lot of work put into this already, and I want to be careful about
> stepping on other people's toes. But I think that unless we talk about
> these issues we're likely to start bumping our heads into one another.
>
> Let me set some ground rules that I think will help:
>
> * I think we should be willing to change MIAME, MAGE, or the ontology
>   in order to make them work together in a synergistic manner.
>
> * I'm *not* convinced that MAGE is *the* correct way of modelling
>   anything, and so if things are better done within the ontology and I
>   can make MAGE smaller, I am *thrilled* to do so.
>
> * I believe that we should let ourselves be guided by 'use cases' - or
>   ways that we anticipate these three pieces will be used by the
>   community. If we have an important use case that we cannot support
>   by the current model, we should change things. Likewise we shouldn't
>   add new 'features' unless we can identify a use case that needs that
>   feature. This just provides a common ground for everyone to
>   understand what the motivations for our decisions are. I believe
>   that MAGE would be much more understandable if we would document
>   more of the use cases that we have explored over time.
>
> First, what does MAGE do? Three things:
> =======================================
>
> 1) Describes an *object model* that enables us to build APIs in Perl,
>    Java, etc. that tell us how to build computer data structures to
>    hold the data
>
> 2) Describes a communication format, MAGE-ML, that enables us to encode
>    the data in a text file that can be transferred over the internet.
>
> 3) Hopefully it will soon describe a relational DB mapping
>
> Second, how will MAGE get used? A couple of ways:
> =================================================
>
> * Software to read MAGE-ML - converts it into programming language
>   objects that store the data temporarily
>
> * Software that reads MIAME data from a database - likewise temporary
>   storage of data into objects
>
> * Software that writes data to a MAGE database - like the MIAMExpress
>   software
>
> * Software to write MAGE-ML - convert data into MAGE objects and the
>   existing MAGE-ML writers can do the rest
>
> * Software to analyze microarray data - since it will likely want to
>   read from and store to a MAGE DB
>
> Those are MAGE's high-level use cases.
>
> What does the ontology do?
> ==========================
>
> An ontology is a description of knowledge. It's a very broad thing,
> and it can become anything that you want to make of it. I'm not sure
> what you (or others) feel about this.
> At the very least I see it providing three things:
>
> 1) A list of properties that samples can have (taxon, cultivar,
>    strain, organ type, tissue type, cell type, etc)
>
> 2) A list of values that each of those properties can have (for organ
>    type: brain, heart, liver, etc)
>
> 3) Definitions for all of the properties and the values.
>
> I believe that this is the fundamental value of the ontology: by
> providing a concrete terminology we can assign meaning to our data
> that is unambiguous. And we can use those unambiguous meanings to
> determine whether two pieces of data (two samples) are similar or
> dis-similar.
>
> How will it be used?
> ====================
>
> From reading the ontology group home page, it seems that the primary
> mission of the group is to develop the sample ontology as a way for
> researchers to annotate the samples used in their experiments. That
> helps MIAME, and it fits nicely with MAGE. Likewise I'm not sure what
> you (or others) feel about this.
>
> At least, I see two things:
>
> 1) In data transmitted by MAGE-ML, a researcher uses ontology terms to
>    describe the samples used in the experiment
>
> 2) When making queries to a DB, a researcher uses ontology terms to
>    find experiments of relevance
>
> Once again, there can be more, but these are the minimal ones:
> assigning terms to data, retrieving similar data and not dis-similar
> ones.
>
> How are MAGE and the Ontology similar?
> ======================================
>
> At a high level, an object model and an ontology are the same thing:
> descriptions of knowledge and the relationships and properties of the
> information. An object model just describes it in a way that enables
> us to build programming language objects from the information.
>
> What I'd like to propose is:
>
> 1) that we let the object model (MAGE) handle the information that is
>    variable and changing
>
> 2) that we let the ontology handle the information that is invariant
>    and unchanging.
>
> For example, 'Measurement' is an object with two components: a value
> and a unit. The value is variable, and therefore Measurement as an
> object is in the realm of MAGE, while the unit, which is invariant
> (meaning we can define all the units that we need - Chemistry Markup
> Language did it for us), should belong to the ontology.
>
> Maybe this is an overly bold suggestion, but I understand how to use
> it within the MAGE-MIAME framework. Like GO it enables us to build an
> unambiguous system of information to describe our experiments from the
> side of the variable data: MAGE, and the invariant data: the
> ontology. Because the ontology data is invariant each concept can have
> a unique identifier in a central database, and a MAGE object that
> wants to reference an ontology entry can just use that ontology
> entry's identifier: e.g. MGED:100456.
>
> Right now, that is all that the MAGE OntologyEntry object provides:
> the ability to reference a database entry in some other DB, as well as
> a bit of extra information such as name and category so that it can be
> useful without performing the lookup.
>
> jas.

------------------------------

Date: Tue, 08 Jan 2002 13:18:41 +0100
From: Martin Hofmann
Subject: Re: [microarray-ontol] Re: sex gender mating type

Dear All,

this discussion reminds me of the discussion that we had after the 2nd MGED
meeting, when the microarray ontology workgroup was established. There is
obviously a sort of "community"-specific use of terms and instances, and
therefore I suggested (during the first meeting of the microarray ontology
workgroup) that we avoid trying to "cover everything in one ontology". I
think one of the reasons why JAX is so successful with GXD and the connection
to the Edinburgh efforts to map expression to anatomy is that the community
involved is focussing on the mouse and only the mouse.
Actually, most of the concepts developed by the mouse community (especially
JAX) are currently being used as a sort of blueprint by people in the field
of human genetics. Therefore I think we should consider splitting the
microarray ontology into domains that use identical (or at least similar)
vocabs. Constructing a microarray ontology for mammalian species (in the
first instance) should be much easier than paying a lot of attention to the
exceptions that protozoa and microbial organisms impose on a more "general"
ontology.

Any community (e.g. researchers working with microbial organisms, plants,
fungi etc.) can easily use the whole or parts of a microarray ontology
developed for mammalian species and "customize" it for use within "their"
community. As Helen already pointed out, this is going to happen anyway, as
all ontologies are being used (or not) and being further developed by users
from different knowledge domains. There is no ultimate / final / perfect
ontology.

I know that the vast majority of microarray applications at present are done
using mammalian cells and organisms, and therefore I suggest solving the
problems of the majority of users first. The result will be that we will come
up faster with a working solution which will be used by 70 - 90 % of all
microarray users (said mammalian community).

Best wishes to everybody and a Happy New Year!

Martin

Helen Parkinson wrote:

> Hi Jason,
>
> a lot of good points here and a good overview.
>
> Chris has a MAML-MIAME-ontol mapping page, this needs
> updating to reflect the changes from MAML to MAGE. This
> would address the question of where the MAGE/ontol/MIAME
> boundaries lie. I am willing to have a go at this, would you
> like to help with the MAGE parts after I generate a rough
> draft,
>
> re: this part of your mail:
> -------------------------
> > An ontology is a description of knowledge. It's a very broad thing,
> > and it can become anything that you want to make of it. I'm not sure
> > what you (or others) feel about this. At the very least I see it
> > providing three things:
> >
> > 1) A list of properties that samples can have (taxon, cultivar,
> >    strain, organ type, tissue type, cell type, etc)
> >
> > 2) A list of values that each of those properties can have (for organ
> >    type: brain, heart, liver, etc)
> >
> > 3) Definitions for all of the properties and the values.
>
> All of the above is at least partially true but the problems
> lie with the values (instances) and how they are used, the
> instances that you have used as examples are the easy ones,
> when you get into for e.g. cell type the relationship
> between the instances is what is important.
>
> The discussion on mating type and sex has shown that
> different species use the same instance in different ways.
> So a simple list of values will not help, instances need to
> be structured, probably in a species specific way and the
> constraints will be complex.
>
> Also the ontology is not invariant and it will change as the
> ontology develops, I am thinking that when our data flow
> increases we may have to work to ontology releases as
> instances are added and terms will become redundant or
> obsolete as they do in the GO ontology. It is not clear to
> me how to handle this easily, perhaps you have some ideas,
>
> cheers
>
> Helen

--
Dr. Martin Hofmann
Executive Project Director
LION bioscience AG
Im Neuenheimer Feld 515
69120 Heidelberg
Germany
phone: (+49)-6221-4038-120
cell.: (+49)-(0)173-9303828
fax: (+49)-6221-4038-401
e-mail: martin.hofmann@lionbioscience.com
www: http://www.lionbioscience.com

------------------------------

Date: Tue, 08 Jan 2002 13:23:53 +0000
From: Helen Parkinson
Subject: Re: [microarray-ontol] Re: sex gender mating type

Hi Martin,

the case for a mammalian ontology is good, but I have S. pombe data that is
coming quite soon, there is a vast amount of yeast data, and there are both
C. elegans and fly data which will soon appear. Another advantage of
considering all model organisms (including the mouse) is that controlled
vocabs exist, and in some cases real ontologies, which we can reference
directly and which have been compiled by experts. There is not at present
even a list of cell types for humans that we can reference.
So I think that we should consider everyone, of course I am biased as an ex-Drosophila biologist, regards and hope to see you at MGED, Helen Martin Hofmann wrote: > > Dear All, > this discussion reminds me to the discussion that we had after the 2nd MGED > meeting, when the microarray ontology workgroup was established. There is > obviously a sort of "community"-specific use of terms and instances and > therefor I suggested (during the first meeting of the microarray ontology > workgroup) to avoid to try to "cover everything in one ontology". I think one > of the reasons why JAX is so successful with GXD and the connection to the > Edinburgh efforts to map expression to anatomy is that the community involved > is focussing on the mouse and only the mouse. > Actually, most of the concepts developed by the mouse community (especially > JAX) are currently being used as a sort of blueprint by people in the field of > human genetics. Therefor I think we should consider to split the microarray > ontology into domains that use identical (or at least similar) vocabs. To > construct a microarray ontology for mammalian species (in the first line) > should be much easier than paying a lot of attention to the exceptions that > protozoa and microbial organisms impose on a more "general" ontology. > > Any community (e.g. researchers working with microbial organisms; plants, > fungi etc.) can easily use the whole or parts of a microarray ontology > developed for mammalian species and "customize" it for use within "their" > community. As Helen already pointed out this is anyway going to happen as all > ontologies are being used (or not) and being further developed by users from > differnt knowledge domains. There is no ultimate / final / perfect ontology. > I know that the vast majority of microarray applications at present is done > using mammlian cells and organisms and therefor I suggest to solve the > problems of the majority of users first. 
The result will be that we will come > up faster with a working solution which will be used by 70 - 90 % of all > microarray users (said mammlian community). > > Best wishes to everybody and a Happy New Year! > > Martin > > Helen Parkinson wrote: > > > Hi Jason, > > > > a lot of good points here and a good overview. > > > > Chris has a MAML-MIAME-ontol mapping page, this needs > > updating to reflect the changes from MAML to MAGE. This > > would address the question of where the MAGE/ontol/MIAME > > boundaries lie. I am willing to have a go at this, would you > > like to help with the MAGE parts after I generate a rough > > draft, > > > > re: this part of your mail: > > ------------------------- > > An ontology is a description of knowledge. It's a very broad > > thing, > > > and it can become anything that you want to make of it. I'm not sure > > > what you (or others) feel about this. At the very least I see it > > > providing three things: > > > > > > 1) A list of properties that samples can have (taxon, cultivar, > > > strain, organ type, tissue type, cell type, etc) > > > > > > 2) A list of values that each of those properties can have (for organ > > > type: brain, heart, liver, etc) > > > > > > 3) Definitions for all of the properties and the values. > > > > All of the above is at least partially true but the problems > > lie with the values (instances) and how they are used, the > > instances that you have used as examples are the easy ones, > > when you get into for e.g. cell type the relationship > > between the instances is what is important. > > > > The discussion on mating type and sex has shown that > > different species use the same instance in different ways. > > So a simple list of values will not help, instances need to > > be structured, probably in a species specific way and the > > constraints will be complex. 
> > > > Also the ontology is not invariant and it will change as the > > ontology develops, I am thinking that when our data flow > > increases we may have to work to ontology releases as > > instances are added and terms will become redundant or > > obsolete as they do in the go ontology. It is not clear to > > me how to handle this easily, perhaps you have some ideas, > > > > cheers > > > > Helen > > > > "Jason E. Stewart" wrote: > > > > > > Hey Chris, > > > > > > I'm sending the reply to the list as I figure this is a generally > > > useful discussion. > > > > > > "Chris Stoeckert" writes: > > > > > > > >> I'm hoping to use the holidays to give me space to actually think > > > > >> about these issues. Angel filled me in on the discussion you had > > > > >> today. There are multiple examples beside age where a > > > > >> self-referential ontology entry would be needed and I hope to > > > > >> provide these to you next week. > > > > > > I'll put my comments after your examples. > > > > > > > Some other cases like age: > > > > BiosourceProvider has 3 attributes. One "biosource_type" is an > > > > enumerated list. The others are has_owner(Person) and > > > > has_donor(Organization). Person is a class with attributes: > > > > first_name, last_name, mid_initials, and > > > > has_organization(Organization). Organization has a name. Both > > > > Person and Organization are subclasses of Contact which has > > > > address, email, fax, toll_free_phone, phone, and has_URI(URI). > > > > > > > > EnvironmentalHistory is a class with subclasses such as > > > > CultureCondition which in turn has subclasses such as > > > > Nutrients. CultureCondition has the attribute > > > > has_measurement(Measurement) and Nutrients has the attribute > > > > nutrient_component(Compound). Compound is a subclass of > > > > OntologyEntry. > > > > > > > > Treatment is a class with subclasses such as > > > > Modification. 
Modification has a subclass SomaticModification > > > > which in turn has the attribute > > > > has_part_modified(OrganismPart) where OrganismPart is an > > > > OntologyEntry. Note that Treatment has_protocol(Protocol) and > > > > Modification has a modification_type. > > > > > > > > There are other examples but you get a sense for the complexity of > > > > these terms that a name, value, source can't cover alone. Also, you > > > > see the need to refer to other ontologies (e.g., compound) and objects > > > > used by MAGE (e.g., Contact, Protocol). > > > > > > Having read the three examples that you presented it is clear to me > > > that before we go much further we should move up a level in our > > > discussion. I really think we need to discuss where the boundaries of > > > the ontology and the MAGE model lie - where they overlap, and why. I > > > believe that along with MIAME, the ontology and MAGE will be important > > > contributions of MGED to the community, and I believe they should all > > > coexist happily. It feels to me that we developed MAGE without paying > > > too much attention (until just recently) to what was going on with the > > > ontology project. > > > > > > I'm getting involved not because I want to prove how clever > > > my ideas are, but because I want to see the ontology become something > > > that's really useful for people. Two years ago I knew nothing about > > > XML. One year ago I knew nothing about UML. Two months ago I knew > > > nothing about ontologies. So at best, I'm an opinionated novice. > > > > > > I'm tossing in all these caveats because I think that there's been a > > > lot of work put into this already, and I want to be careful about > > > stepping on other people toes. But I think that unless we talk about > > > these issues we're likely to start bumping our heads into one another. 
> > > > > > > > > Let me set some ground rules that I think will help: > > > > > > * I think we should be willing to change MIAME, MAGE, or the ontology > > > in order to make them work together in a synergistic manner. > > > > > > * I'm *not* convinced that MAGE is *the* correct way of modelling > > > anything, and so if things are better done within the ontology and I > > > can make MAGE smaller, I am *thrilled* to do so. > > > > > > * I believe that we should let ourselves be guided by 'use cases' - or > > > ways that we anticipate these three pieces will be used by the > > > community. If we have an important use case that we cannot support > > > by the current model, we should change things. Likewise we shouldn't > > > add new 'features' unless we can identify a use case that needs that > > > feature. This just provides a common ground for everyone to > > > understand what the motivations for our decisions are. I believe > > > that MAGE would be much more understandable if we would document > > > more of the use cases that we have explored over time. > > > > > > First, what does MAGE do? Three things: > > > ======================================= > > > > > > 1) Describes an *object model* that enables us to build API's in Perl, > > > Java, etc that tell us how to build computer data structures to > > > hold the data > > > > > > 2) Describes a communication format MAGE-ML that enables us to encode > > > the data in a text file that can be transfered over the internet. > > > > > > 3) Hopefully it will soon describe a relation DB mapping > > > > > > Second, how will MAGE get used? 
A couple of ways: > > > ================================================= > > > > > > * Software to read MAGE-ML - converts it into programming language > > > objects that store the data temporarily > > > > > > * Software that reads MIAME data from a database - likewise temporary > > > storage of data into objects > > > > > > * Software that writes data to a MAGE database - like the MIAMExpress > > > software > > > > > > * Software to write MAGE-ML - convert data into MAGE objects and the > > > existing MAGE-ML writers can do the rest > > > > > > * Software to analyze microarray data - since it will likely want to > > > read from and store to a MAGE DB > > > > > > Those are MAGE's high-level use cases. > > > > > > What does the ontology do? > > > ========================== > > > > > > An ontology is a description of knowledge. It's a very broad thing, > > > and it can become anything that you want to make of it. I'm not sure > > > what you (or others) feel about this. At the very least I see it > > > providing three things: > > > > > > 1) A list of properties that samples can have (taxon, cultivar, > > > strain, organ type, tissue type, cell type, etc) > > > > > > 2) A list of values that each of those properties can have (for organ > > > type: brain, heart, liver, etc) > > > > > > 3) Definitions for all of the properties and the values. > > > > > > I believe that this is the fundamental value of the ontology: by > > > providing a concrete terminology we can assign meaning to our data > > > that is unambiguous. And we can use those unambiguous meanings to > > > determine whether two pieces of data (two samples) are similar or > > > dis-similar. > > > > > > How will it be used? > > > ==================== > > > > > > >From reading the ontology group home page, it seems that the primary > > > mission of the group is develop the sample ontology as a way for > > > researchers to annotate the samples used in their experiments. 
That > > > helps MIAME, and it fits nicely with MAGE. Likewise I'm not sure what > > > you (or others) feel about this. > > > > > > At least, I see two things: > > > > > > 1) In data transmitted by MAGE-ML, a researcher uses ontology terms to > > > describe the samples used in the experiment > > > > > > 2) When making queries to a DB, a researcher uses ontology terms to > > > find experiments of relevance > > > > > > Once again, there can be more, but these are the minimal ones: > > > assigning terms to data, retrieving similar data and not dis-similar > > > ones. > > > > > > How are MAGE and the Ontology similar? > > > ====================================== > > > > > > At a high level, an object model and an ontology are the same thing: > > > descriptions of knowledge and the relationships and properties of the > > > information. An object model just describes it in a way that enables > > > us to build programming language objects from the information. > > > > > > What I'd like to propose is: > > > > > > 1) that we let the object model (MAGE) handle the information that is > > > variable and chaning > > > > > > 2) that we let the ontology handle the information that is invariant > > > and unchanging. > > > > > > For example, 'Measurement' is an object with two components: a value > > > and a unit. The value is a variable and therefore Measurement as an > > > object is in the realm of MAGE, while unit which is invariant (meaning > > > we can define all the units that we need - Chemistry Markup Language > > > did it for us) and should belong to the ontology. > > > > > > Maybe this is an overly bold suggestion, but I understand how to use > > > it within the MAGE-MIAME framework. Like GO it enables us to build an > > > unambiguous system of information to describe our experiments from the > > > side of the variable data: MAGE, and the invariant data: the > > > ontology. 
> > > Because the ontology data is invariant, each concept can have
> > > a unique identifier in a central database, and a MAGE object that
> > > wants to reference an ontology entry can just use that ontology
> > > entry's identifier: e.g. MGED:100456.
> > >
> > > Right now, that is all that the MAGE OntologyEntry object provides:
> > > the ability to reference a database entry in some other DB, as well as
> > > a bit of extra information such as name and category so that it can be
> > > useful without performing the lookup.
> > >
> > > jas.
>
> --
> Dr. Martin Hofmann
> Executive Project Director
> LION bioscience AG
> Im Neuenheimer Feld 515
> 69120 Heidelberg
> Germany
> phone: (+49)-6221-4038-120
> cell.: (+49)-(0)173-9303828
> fax: (+49)-6221-4038-401
> e-mail: martin.hofmann@lionbioscience.com
> www: http://www.lionbioscience.com

------------------------------

Date: 08 Jan 2002 09:03:25 -0700
From: jason@openinformatics.com (Jason E. Stewart)
Subject: Re: [microarray-ontol] Re: sex gender mating type

"Helen Parkinson" writes:

> Chris has a MAML-MIAME-ontol mapping page, this needs updating to
> reflect the changes from MAML to MAGE. This would address the
> question of where the MAGE/ontol/MIAME boundaries lie. I am willing
> to have a go at this, would you like to help with the MAGE parts
> after I generate a rough draft,

Yes, since I've stuck my foot in it, it would be useful if I agreed to
actually *help*.

> All of the above is at least partially true but the problems lie
> with the values (instances) and how they are used, the instances
> that you have used as examples are the easy ones, when you get into
> for e.g. cell type the relationship between the instances is what is
> important.

Important to whom? For MAGE, all that is important is that I can point
you to where you get more information about the ontology term - I
don't need to capture those relationships in the model. For the
curator, the contextual relationships of the data are critical.
Whether the structure of the ontology is a list, a tree, a DAG, or a
graph matters little to me. The key to the issue for understanding how
to combine MAGE and MSO (MGED Sample Ontology??) is that *currently*
MAGE expects ontology terms to be identifiable via some unique
key. That's all. They are treated exactly analogous to a SwissProt
entry or a TrEMBL entry.

When a scientist associates an ontology term with his data, MAGE needs
to transmit that information, and the simplest way is to identify the
database (ontology) to which the term belongs, and its accession
number (identifier) within that DB.

Then when the data arrives wherever it's headed, say ArrayExpress,
then it's up to y'all what to do with the ontology entries. You get to
set the policy for *what* ontologies are used, and which aren't. Some
you may want to store locally within the DB, some you won't and the
ontology entries will just be accession numbers into foreign
databases.

The MAGE model for OntologyEntry includes a little bit of info about
the term so that it is slightly more than an accession number. That
was so that the term could be useful without having the entire
ontology stored locally.

The discussion I've started is asking whether we want to make the MAGE
concept of OntologyEntry more complex than it is so that we can
capture more information than it currently does. My current position
is we should keep the model simple. If some group wants to be able to
capture more information about the ontology, they should keep a local
copy of the ontology and handle it specially within their software and
their DB.

It's my guess that before we can actually get much further into this
discussion, we're going to need:

* example MAGE-ML files that use ontologies
* software that can handle OntologyEntry's in MAGE-ML data files
* some form of query engine that can use OntologyEntry's in a DB
  query.

These will provide some nice examples of where things work and where
they don't.
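
A minimal sketch of the reference scheme described above, assuming a
Python representation: a term is carried as a database name plus an
accession, with name and category as optional extras so that the term
is readable without a lookup. The class OntologyRef, its field names,
and the helper functions are hypothetical and are not part of MAGE or
MAGE-ML. The identifier MGED:100456 is the example from earlier in
the thread; the names attached to it and the other sample data are
invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class OntologyRef:
    """Hypothetical stand-in for a reference to an ontology term."""
    database: str                    # which ontology/DB the term lives in
    accession: str                   # identifier within that DB
    name: Optional[str] = None       # optional: readable without a lookup
    category: Optional[str] = None   # optional: e.g. the property it fills

    @property
    def key(self) -> Tuple[str, str]:
        # The unique key is only (database, accession).
        return (self.database, self.accession)


def parse_ref(identifier: str, name: Optional[str] = None,
              category: Optional[str] = None) -> OntologyRef:
    """Split an identifier of the assumed form 'DATABASE:ACCESSION'."""
    database, sep, accession = identifier.partition(":")
    if not sep or not database or not accession:
        raise ValueError(f"expected 'DATABASE:ACCESSION', got {identifier!r}")
    return OntologyRef(database, accession, name, category)


def same_term(a: OntologyRef, b: OntologyRef) -> bool:
    """Two references denote the same term when their keys match."""
    return a.key == b.key


def find_samples(samples: Dict[str, List[OntologyRef]],
                 wanted: OntologyRef) -> List[str]:
    """Return ids of samples annotated with the wanted term."""
    return [sample_id for sample_id, refs in samples.items()
            if any(same_term(ref, wanted) for ref in refs)]


# Invented annotations: the names and MGED:100457 are made up.
liver = parse_ref("MGED:100456", name="liver", category="organ type")
samples = {
    "sample-1": [liver],
    "sample-2": [parse_ref("MGED:100457", name="heart",
                           category="organ type")],
    "sample-3": [parse_ref("MGED:100456")],  # same term, no extras
}

print(find_samples(samples, liver))  # ['sample-1', 'sample-3']
```

Comparing only the (database, accession) pair means a reference that
omits the name still matches. Anything about how terms relate to each
other stays in the ontology itself, which is the simple model argued
for above.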
> Also the ontology is not invariant and it will change as the
> ontology develops, I am thinking that when our data flow increases
> we may have to work to ontology releases as instances are added and
> terms will become redundant or obsolete as they do in the GO
> ontology. It is not clear to me how to handle this easily, perhaps
> you have some ideas, I don't yet.

I'm sure that we will be able to learn a lot from the experience of
the GO user community as to how they've dealt with the same issues.

jas.

------------------------------

Date: Tue, 08 Jan 2002 16:48:32 +0000
From: Helen Parkinson
Subject: Re: [microarray-ontol] MAGE ontology list

Hi Jason,

I have been looking @ this list to try and add a few terms. Re: the
ExperimentalFactor, this has two categories, biological and
methodological, but the e.g. is differing protocols. Should this be
linked to protocol/protocol application then? Or maybe you just
intended it to be simple:
extract_protocol,labelling_protocol,hyb_protocol, or have I missed the
point? I just got an example where the difference was a protocol, and
it's not clear to me if it should be expressed here and in the places
where protocol would normally be referenced.

cheers, glad you're *helping*, sadly this is the price you pay for
expressing an interest ;-)

helen

"Jason E. Stewart" wrote:
>
> Hey All,
>
> Here is the list of all OntologyEntries that exist in MAGE,
> categorized by how complex I think they will be to create. There are
> only two that I think will take much effort, and you're already
> working on the big one.
>
> jas.
>
> --
> Finished:
> =========
> BioSequence:Species
> DesignElementGroup:Species
>
> Trivial: List
> =============
> Image:Format {TIFF,JPG}
> BioSequence:PolymerType {DNA,RNA,protein}
> NodeValue:DataType {float,int,string}
> Parameter:DataType {float,int,string}
> QuantitationType:DataType {float,int,string}
> NodeValue:Scale {log_2,log_10}
> QuantitationType:Scale {log_2,log_10}
>
> OpenEnded: List
> =============
> Contact:Roles {experiment_provider,software_provider}
> BioSequence:OntologyEntries
> Reporter:FailTypes
> Reporter:WarningTypes
> DesignElement:ControlType
> Description:Annotations
>
> Medium: List
> =============
> DerivedBioAssay:Type {replicate_average}
> PhysicalArrayDesign:SurfaceType
> ArrayGroup:SubstrateType
> BioSequence:Type {exon,intron,gene}
> DesignElementGroup:Type
> BioMaterial:MaterialType
> BioSample:Type
> DatabaseEntry:Type
> BibliographicReference:Parameters {publisher_address}
> ExperimentDesign:Types {heat_shock,osmotic_shock,time_series}
> ExperimentalFactor:Category {biological_factor,methodological_factor}
> Protocol:Type {cDNA_labelling,RNA_extraction}
> Hardware:Type {two_color_scanner,ceramic_pin_arrayer,computer,thermocycler}
> Software:Type {feature_extraction,spot_finding}
> Compound:MerckIndex => Compound:Indices {Merck,CAS}
>
> Hard: DAG
> =========
> Treatment:Action {wait,centrifuge,add}
>
> Impossible: Object
> ==================
> BioSource:Characteristics

------------------------------

Date: Tue, 8 Jan 2002 08:55:26 -0800
From: "Miller, Michael"
Subject: RE: [microarray-ontol] Re: sex gender mating type

All,

I very much agree with Jason on the relationship between MAGE and
ontologies, that it allows referencing an ontology but doesn't attempt
to capture more than is necessary to then go to the ontology itself.

I do differ in that I think we need to provide a little more structure
in MAGE:

> Whether the structure of the ontology is a list, a tree, a DAG, or a
> graph matters little to me.
> The key to the issue for understanding how
> to combine MAGE and MSO (MGED Sample Ontology??) is that *currently*
> MAGE expects ontology terms to be identifiable via some unique
> key. That's all. They are treated exactly analogous to a SwissProt
> entry or a TrEMBL entry.

Here's where I need some help in understanding. I think from the
discussion that for a particular context, such as characteristics for
a cell line, it is necessary to communicate the relationship of the
ontology entries to one another in order to understand how they go
together for that particular instance of a cell line; a simple listing
of the entries isn't rich enough. Key word here being an "instance".

If that's true, then I do think we need to make OntologyEntry richer
for MAGE.

To sum up, I think the requirement for MAGE is that it allows the
specification of an instance of OntologyEntry. I think the mention of
use cases is exactly how to determine this.

Michael

> -----Original Message-----
> From: jason@openinformatics.com [mailto:jason@openinformatics.com]
> Sent: Tuesday, January 08, 2002 8:03 AM
> To: ontol
> Subject: Re: [microarray-ontol] Re: sex gender mating type
>

------------------------------

Date: 15 Jan 2002 22:40:22 -0700
From: jason@openinformatics.com (Jason E. Stewart)
Subject: [microarray-ontol] New WWW site

Hey All,

just wanted you to know that I've given the ontology its own space at
mged.sf.net. It really doesn't have any info yet, but I can put it up
as soon as someone comes up with content ...

jas.

------------------------------

End of microarray-ontol-digest V1 #18
*************************************