Semantic Web Technology Evaluation Ontology

Project: Semantic Discovery: Discovering Complex Relationships in Semantic Web

The emerging Semantic Web community needs a common infrastructure for testing the scalability of new software that makes use of machine-processable data. One particular need is a large, high-quality test ontology against which ontology management tools can be assessed for scalability and other properties. Considering that there are somewhere between 20 and 50 ontology tools alone, the question arises: how do we compare them?

Of particular interest is not just the schema of the ontology, but also its population (instances, assertions or description base). A populated ontology (an ontology with instances or assertions) is critical for core semantic issues such as semantic disambiguation, and it is necessary for checking the scalability of tools and techniques, such as reasoning techniques. An ontology of real-world scale is needed to build benchmarks for evaluating and comparing tools and techniques.

An advantage of having access to both the schema and the population of the ontology is that a class can be understood not only from its name, but also from the instances that belong to it (a class name by itself can be interpreted in different ways by different people).
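A minimal sketch of this idea in Python, assuming a toy schema and population. The class and instance names here are purely illustrative and are not taken from the actual ontology. The class name "Bank" alone could mean a financial institution or a river bank; listing a few instances alongside the parent class makes the intended sense clear.

```python
from collections import defaultdict

# Hypothetical toy schema: each class maps to its parent class (None for a root).
SCHEMA = {"Bank": "Organization", "Organization": None}

# Hypothetical population: (instance, class) assertions.
ASSERTIONS = [
    ("Wells_Fargo", "Bank"),
    ("Bank_of_America", "Bank"),
]

def instances_by_class(assertions):
    """Group instance names under the class they are asserted to belong to."""
    grouped = defaultdict(list)
    for instance, cls in assertions:
        grouped[cls].append(instance)
    return dict(grouped)

def describe_class(cls, schema, assertions, sample_size=3):
    """Return the class name, its parent class, and a few sample instances."""
    members = instances_by_class(assertions).get(cls, [])
    return {
        "class": cls,
        "parent": schema.get(cls),
        "examples": sorted(members)[:sample_size],
    }

summary = describe_class("Bank", SCHEMA, ASSERTIONS)
print(summary)
```

With only the schema, a reader sees "Bank" under "Organization"; with the population as well, the sample instances confirm which reading of the name is meant.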

Many real-world ontologies have tens to a few hundred classes and over one million objects (instances). An iterative process will be used to periodically extend the SWETO description (schema) and description base (instances, assertions). By the end of December 2003, we expect to have extracted at least 100,000 instances for SWETO v.1, and by the end of the first quarter of 2004, we expect to have at least 1 million instances for SWETO v.2.

A Semantic Web Technology Evaluation Ontology (SWETO) will serve the above purpose. It will be created (semi-)automatically by

  1. designing the SWETO schema using an ontology design toolkit,
  2. identifying knowledge sources that can be used to populate parts of SWETO,
  3. utilizing extractor agents (created by people using a toolkit, without programming) to periodically and automatically extract knowledge from various open and public sources,
  4. semi-automatically disambiguating the knowledge (with limited human involvement),
  5. integrating related knowledge to populate SWETO, and
  6. providing the ability to export SWETO in RDF/RDFS and a version of OWL.
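Steps 4 to 6 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not use or imitate the Semagix Freedom toolkit, the records and the `http://example.org/testbed#` namespace are hypothetical, the disambiguation rule (matching on a normalized name) is far cruder than a real one would be, and N-Triples is chosen as the RDF serialization only because it is the simplest to write by hand.

```python
BASE = "http://example.org/testbed#"  # placeholder namespace, not the real one
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

def normalize(name):
    """Crude disambiguation key: lowercase, keeping only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def disambiguate(records):
    """Step 4 and 5: merge records whose names share a normalized key."""
    merged = {}
    for rec in records:
        key = normalize(rec["name"])
        entry = merged.setdefault(
            key, {"name": rec["name"], "type": rec["type"], "sources": set()}
        )
        entry["sources"].add(rec["source"])  # remember where each fact came from
    return merged

def to_ntriples(entities):
    """Step 6: serialize the merged entities as RDF in N-Triples syntax."""
    lines = []
    for key in sorted(entities):
        entity = entities[key]
        subject = f"<{BASE}{key}>"
        # Escape backslashes and double quotes inside the literal.
        label = entity["name"].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f"{subject} <{RDF_TYPE}> <{BASE}{entity['type']}> .")
        lines.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
    return "\n".join(lines)

# Hypothetical output of two extractor agents (step 3).
extracted = [
    {"name": "LSDIS Lab", "type": "Organization", "source": "source_a"},
    {"name": "lsdis lab", "type": "Organization", "source": "source_b"},
    {"name": "University of Georgia", "type": "University", "source": "source_a"},
]

entities = disambiguate(extracted)
ntriples = to_ntriples(entities)
print(ntriples)
```

The two spellings of the lab's name collapse into one entity that records both sources, which is where limited human review would fit in a semi-automatic process: a person would only need to confirm or reject proposed merges.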

The Semagix Freedom Toolkit will be used as the primary basis of the technology, with the results suitably extended for standards compliance. All technical work will be done by LSDIS Lab personnel working on the SAI and SemDIS projects. Semagix Freedom is based on technology developed at and licensed from the LSDIS Lab, and is available to the lab for such uses. W3C's Semantic Web Activity will help in securing knowledge sources (item 2 above). This ontology will be made available through the W3C Semantic Web Activity for use by its active community members. We are looking for further funding to extend and maintain this ontology.

Title: Semantic Web Technology Evaluation Ontology, Version 1.0: Reference Description
Date Issued: 2003-12-02
Supersedes: N/A
Visualization: Testbed
Latest version:
Status of document: This is the pre-release of the 'SemDis Testbed'
Description of document: This document is the reference description, version 1.0 of the SemDis Testbed. LSDIS Lab, Computer Science Department, University of Georgia
Ontology Schema (OWL): testbed_1_0.owl
Ontology Schema: Textual Description

This material is based upon work supported by the National Science Foundation under Grant No. IIS-0325464 titled "SemDis: Discovering Complex Relationships in Semantic Web". Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

