ADEPT

For the Alexandria Digital Earth Prototype (ADEPT) project, it has been proposed that UGA will play the lead role in Iscape (information landscape) construction, involving the following activities:
To achieve these objectives, we have identified certain focus areas that we need to concentrate on:
Metabase for GIS

Metadata consist of information that characterizes data. Metadata are used to provide documentation for data products. In essence, metadata answer who, what, when, where, why, and how about every facet of the data being documented. In the context of geospatial digital data, metadata are the information that describes the content, quality, condition, and other relevant characteristics of the data. The FGDC standard is designed to describe all possible geospatial data.

DataSets

DataSets are the basic entities that the system deals with. Each dataset is described by a set of metadata and is assigned a unique dataset-id when it is entered in the metabase. No choice of supported metadata can satisfy the needs of all conceivable GIS applications (though the FGDC metadata specification is quite comprehensive). In our system we choose a number of attributes that we think describe a particular dataset well enough to support the most important queries and for which many sources provide values. If a particular extractor does not support a certain attribute, a default value is assumed.

Implementation of Extractors

One problem with extracting metadata from Web pages is that the Web sites providing metadata differ widely in their structure and in the metadata they provide. In a way this is desirable, because it allows the sites to be differentiated and information to be retrieved from a diverse range of sources. However, since a standard way of describing metadata has not yet been developed, specialized extractors are needed to retrieve the metadata used for attribute-based queries. Even though extractors are designed to be as generic as possible, they might have to be changed if a Web site changes its structure. The change required can be minimal or can take considerable effort, depending on the extent to which the Web site's structure changes.
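As a rough illustration of the ideas above, the following Python sketch shows one way a site-specific extractor could read metadata from a page, fall back to default values for attributes it cannot find, and assign a dataset-id. Everything named here is an assumption made for illustration: the attribute set (`title`, `theme`, `format`), the use of HTML `<meta>` tags as the metadata source, the XML element names, and the counter standing in for the metabase's id assignment. None of it is taken from the project's actual implementation.

```python
from html.parser import HTMLParser
from itertools import count
import xml.etree.ElementTree as ET

# Hypothetical attribute set with default values; the attributes actually
# supported by the metabase are not listed in the text.
DEFAULTS = {"title": "unknown", "theme": "unknown", "format": "unknown"}

# Stand-in for the metabase's unique dataset-id assignment (an assumption).
_next_id = count(1)


class MetaTagExtractor(HTMLParser):
    """Extractor for sites assumed to expose metadata in <meta> tags."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name"), attrs.get("content")
        # Keep only attributes the system supports; ignore everything else.
        if name in DEFAULTS and content:
            self.found[name] = content


def extract(html):
    """Return one dataset record as an XML string, filling in defaults."""
    parser = MetaTagExtractor()
    parser.feed(html)
    record = {**DEFAULTS, **parser.found}
    dataset = ET.Element("dataset", id=str(next(_next_id)))
    for name, value in record.items():
        ET.SubElement(dataset, name).text = value
    return ET.tostring(dataset, encoding="unicode")


page = '<html><head><meta name="title" content="Georgia land cover"></head></html>'
print(extract(page))
```

A site with a different page structure would need its own parser class, which is why a change in a site's layout can force a change in its extractor.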
The extractors visit different Web sites, analyze the contents of the HTML pages, process those contents, and return the relevant metadata as XML. In the future, advances in XML may make specialized extractors unnecessary. It is also possible that extractors will be implemented as stand-alone agents running on the client or server side.

ADEPT investigates information requests built upon the concept of an ISCAPE (Information Landscape). Here is our working definition: an ISCAPE (Information Landscape) is a collection of semantically related information assets, along with the ways to analyze and visualize them, that facilitates learning about the Digital Earth. These information assets:
Iscapes are designed to correlate information across the Internet. Iscapes are dynamic in that the information they lead to is not a single, hard-coded Web page but rather a collection of related datasets that is generated at runtime. An Iscape can therefore be considered an information request expressed as a combination of keyword-, attribute-, and content-based search. Below is a sample:
We will be using RDF (Resource Description Framework) as the framework for constructing Iscapes. RDF uses XML as its underlying syntactic model. The attribute search part of an Iscape is constructed by describing the entities involved, which are taken from different domain ontologies.

Design and implementation of ontologies

In Iscapes the terms that describe entities can be ambiguous: for example, "cricket" means something different to a fan of the game than to a biology professor. To eliminate ambiguity, information beyond the names of the correlated terms is required. Ontologies provide this additional information. Ontologies are used to describe not only the datasets we are interested in, but also the retrieved datasets, which are heterogeneous in nature. One set of query results may contain maps, real videos, and WordPerfect documents. The agent (display agent) must be able to distinguish between the different dataset types in order to display them appropriately and act properly when the user retrieves them. A classification of all possible datasets not only supports this behavior but also has additional advantages: advanced attribute queries such as "retrieve only images" can be made, and the results can be displayed in groups of related dataset types. Multiple topic-specific ontologies will be created to describe the whole model. We have chosen the Resource Description Framework (RDF) Schema for defining ontologies. A graphical ontology designer tool (already created by a third party) may be provided to the domain experts so that they can design their own ontologies.

Agent Architecture

This project uses a multi-agent system to process the Iscape. Six agent types are involved in this task: the User Agent, one or more Broker Agents, an Ontology Agent, a Query-Planning Agent, and possibly many Resource Agents.
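The dataset-type classification described in the ontology discussion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the type names, the parent/child relationships, and the record layout are hypothetical, and a plain dictionary stands in for an ontology that the project would define in RDF Schema.

```python
from collections import defaultdict

# Hypothetical classification: each dataset type points to its parent type.
PARENT = {
    "map": "image",
    "photo": "image",
    "image": "dataset",
    "video": "dataset",
    "document": "dataset",
}


def is_a(dataset_type, ancestor):
    """Walk up the classification to test whether a type falls under ancestor."""
    while dataset_type is not None:
        if dataset_type == ancestor:
            return True
        dataset_type = PARENT.get(dataset_type)
    return False


def only(results, ancestor):
    """Attribute query such as "retrieve only images"."""
    return [r for r in results if is_a(r["type"], ancestor)]


def group_by_type(results):
    """Group dataset ids by type so related results can be displayed together."""
    groups = defaultdict(list)
    for r in results:
        groups[r["type"]].append(r["id"])
    return dict(groups)


# Hypothetical heterogeneous result set returned by one query.
results = [
    {"id": 1, "type": "map"},
    {"id": 2, "type": "video"},
    {"id": 3, "type": "document"},
    {"id": 4, "type": "photo"},
]
print(only(results, "image"))
print(group_by_type(results))
```

Because the query walks up the classification, a request for images also returns the more specific types placed under it, which a flat list of type names would not support.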
©2005 LSDIS and the University of Georgia. All rights reserved. Large Scale Distributed Information Systems