Knowledge Representation & Applications
(Mostly Biomedical applications)

MEBI 550, Winter, '05

Exercise #1: Building an Ontology (in Protégé)

Due Wed, Jan 12th, 7pm

The main objective for this assignment is to get some hands-on experience as an "ontology builder" and to encounter some of the design choices ontology builders face. Secondarily, you will gain familiarity with the Protégé ontology & knowledge base development environment.

As described in class, there are at least two common reasons for creating an ontology: (1) knowledge sharing, and (2) reasoning, which requires representing the information or knowledge in a formal syntax and semantics. For this assignment, I would like you to create an ontology of knowledge about cancer. (No, I'm not expecting you to cure cancer in a week.)

Constraints:

  1. All information in your ontology must be based on information from the web site http://www.nci.nih.gov/cancertopics/, one of the subpages under www.nci.nih.gov. I've chosen this web site because it is a good one--one side effect of this exercise is that you should learn more about this disease as you build your ontology (or at least more about what NCI says about the disease...)
  2. Your Protégé ontology must have at least 35 frames (and no more than 100 frames!)
  3. You must try to capture some of the breadth of knowledge on that web page -- note that there are nine (9) major headings on that page. You need not capture the organization into these particular 9 headings, but try to capture some of the diversity. (You may want to skip over the headings labeled "Cancer terminology resources" & "Cancer literature".)
  4. You can choose whatever type or depth of knowledge interests you. The interesting thing about cancer information is that it quickly spans all three of our domain application areas--there is information at this web site that has a bioinformatics flavor, some that is public-health oriented, and of course, plenty of clinical informatics. You may also choose to explore depth in a particular type of cancer. Of course, you may leave the nci.nih.gov domain in your quest for depth, but all knowledge in your ontology must be tied back to your higher-level organization of the breadth of knowledge about cancer.
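
As a rough illustration of constraint 2, the sketch below models an ontology as a flat list of frames and checks the 35-to-100 range. It is not Protégé code: the `Frame` class, the `kind` labels, and the example frame names are hypothetical, and it assumes that each class, slot, and instance counts as one frame.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_FRAMES = 35   # lower bound from constraint 2
MAX_FRAMES = 100  # upper bound from constraint 2


@dataclass
class Frame:
    """A hypothetical stand-in for one frame in the ontology."""
    name: str
    kind: str                     # assumed labels: "class", "slot", or "instance"
    parent: Optional[str] = None  # superclass name, if this frame is a subclass


def frame_count_ok(frames: List[Frame]) -> bool:
    """Return True when the number of frames is within the assigned range."""
    return MIN_FRAMES <= len(frames) <= MAX_FRAMES


# Illustrative frames only; a real submission needs at least 35.
frames = [
    Frame("Cancer", "class"),
    Frame("BreastCancer", "class", parent="Cancer"),
    Frame("primary_site", "slot"),
]

print(len(frames), "frames; within range:", frame_count_ok(frames))
```

Protégé itself reports how many frames a project contains, so a check like this is only a way of thinking about the size limit, not something you need to run.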

There are no constraints on the type of ontology you build or on how you organize the knowledge about cancer. I encourage you to think imaginatively about the knowledge on the web pages. You may restructure and add to this information as you see fit. Try to use a variety of Protégé features (see the grading criteria: at this stage, you can't be "wrong"!). In general, you should imagine that you are building your ontology of cancer information either because (a) other systems need programmatic access to the knowledge stored at the cancer.gov web site (the NCI wants to encourage knowledge sharing), or (b) other systems will query the knowledge and perform some reasoning over the formal knowledge in your ontology. See also deliverable #2 below.

Deliverables:

  1. A Protégé ontology (usually a set of three files -- .pont, .pprj and .pins). Please either email these to me, or make them available at a web site for download. The latter is preferable if the files are large.
  2. Some information about how you might expect this ontology to be used and any significant choices you made in your ontology design. This can be simply a few sentences--certainly no more than a page of text.
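
Before sending deliverable #1, you may want to confirm that all three files are present. The sketch below is one way to do that; it assumes the three files sit in one folder and share a base name, and the function name `missing_files` and the base name `cancer` are hypothetical.

```python
from pathlib import Path
from typing import List

# The three file types named in deliverable #1.
REQUIRED_EXTENSIONS = (".pont", ".pprj", ".pins")


def missing_files(project_dir: str, base_name: str) -> List[str]:
    """Return the required extensions that have no matching file in project_dir."""
    folder = Path(project_dir)
    return [
        ext
        for ext in REQUIRED_EXTENSIONS
        if not (folder / f"{base_name}{ext}").is_file()
    ]


if __name__ == "__main__":
    # "cancer" is a placeholder; use the name you gave your own project.
    missing = missing_files(".", "cancer")
    print("Missing:", ", ".join(missing) if missing else "none")
```

An empty result means all three files were found under the assumed naming scheme.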

Note: I will be displaying some of your ontologies in class on Monday, April 5th. (Thus, I'm asking to have it by 7pm Wed, so that I can briefly pre-browse them before class.) One of the points of this exercise will be to demonstrate the diversity of design choices. Thus, please work on this assignment on your own, and be willing to answer questions about your ontology in class.

Protégé is available at protege.stanford.edu. Please use version 3.0, the "beta release". Despite the "beta" label, this is fairly stable, and I expect it to become the official release sometime during this quarter. Note that Protégé can be configured with a wide variety of additional "plug-in" components. If you are curious, feel free to try some of these, especially the visualization plug-ins (but at this stage, I would warn *against* trying the OWL plug-in).

Grading criteria:

This is almost a freebie. Since this is due in week 2 of the quarter, you will not be graded on the particular content or design choices in your ontology. I will grade on perceived level of effort. I will generally give one of three grades: 4.0 for assignments where the student put in an appropriate amount of thought and work; 3.0 for assignments that seem to satisfy only the bare minimum requirements and where the student does not show a significant level of thought or work; and 2.0 or lower for assignments that do not meet the requirements.

If this assignment is late, it will be significantly downgraded -- a 4.0 is not possible for late assignments.

 

Last Updated:
Jan 2, '05

Contact the instructor at: gennari@u.washington.edu