Semantic Content Management for Enterprises
and National Security


Amit Sheth

Director, LSDIS Lab, Computer Science, University of Georgia

CTO, Voquette, Inc.


Researchers in diverse areas have studied semantics for a long time. We have seen steady progress from syntax, to representation and structure, and to semantics, in the ways we approach and solve the challenges of finding, integrating and using information of diverse types and from diverse sources. Now, the vision of the Semantic Web juxtaposes semantics and the Web, and forces us to deal simultaneously with the complexity of modeling, reasoning and perception needed to support semantics, and with the huge scale and heterogeneity of every imaginable kind found on the Web, in large Enterprise portals, or in anything of interest to National Security. Even before the Semantic Web becomes a reality, we have started to see applications of semantic technology at the Enterprise and Industry levels, enabling what we call Semantic Enterprises, as well as applications to the challenging problems of National Security.


We will discuss an example of an emerging semantic technology, the Semantic Content Organization and Retrieval Engine (SCORE), based on research at the University of Georgia's LSDIS Lab. Novel and challenging real-world Semantic Applications for Enterprises and National Security will also be presented.


Core capabilities of SCORE include

         Aggregation of Unstructured, Semi-structured and Structured data in both push (e.g., content feed) and pull (e.g., focused site crawling or database access) modes

         Analysis for Normalizing and Organizing Content (including Automatic Classification)

         Automatic and Semi-automatic Semantic Metadata Generation/Annotation

         Automatic maintenance of knowledge-base and metadata

         High-performance, Scalable and Robust Semantic Engine with an API for building Semantic Applications


Technologies used by SCORE include tools that let non-programmers create agents for extracting knowledge and metadata from unstructured and semi-structured content, domain-specific ontologies, distributed agent execution, a classifier committee that combines classifiers based on multiple techniques (including several machine-learning and knowledge-base techniques), and a Semantic Engine that uses main-memory-based, multi-level, incremental and distributed indexing.
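
The classifier committee idea can be illustrated with a minimal sketch. The code below is not SCORE's implementation; it assumes the simplest possible combination rule (a majority vote) and uses hypothetical toy keyword classifiers as committee members, where a real committee would combine machine-learning and knowledge-base classifiers. All names (`keyword_member`, `committee_vote`, `MEMBERS`) and the vocabularies are invented for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List, Set

# A committee member maps a piece of text to a category label.
Classifier = Callable[[str], str]


def keyword_member(vocab: Dict[str, Set[str]]) -> Classifier:
    """Build a toy member that scores each label by keyword overlap (hypothetical)."""
    def classify(text: str) -> str:
        words = set(text.lower().split())
        scores = {label: len(words & terms) for label, terms in vocab.items()}
        best = max(scores, key=scores.get)
        # Abstain with "unknown" when no keyword matched at all.
        return best if scores[best] > 0 else "unknown"
    return classify


def committee_vote(members: List[Classifier], text: str) -> str:
    """Return the label chosen by most members (assumed majority-vote rule).

    Ties go to the label that was voted for first.
    """
    votes = Counter(member(text) for member in members)
    return votes.most_common(1)[0][0]


# Three members with different (made-up) vocabularies stand in for
# classifiers built with different techniques.
MEMBERS: List[Classifier] = [
    keyword_member({"business": {"stock", "market", "shares"},
                    "sports": {"game", "score", "team"}}),
    keyword_member({"business": {"earnings", "shares", "merger"},
                    "sports": {"league", "coach", "team"}}),
    keyword_member({"business": {"ceo", "revenue"},
                    "sports": {"match", "goal", "rally"}}),
]

if __name__ == "__main__":
    print(committee_vote(MEMBERS, "stock market shares rally"))
```

In this sketch the third member is misled by the word "rally", but the other two outvote it, which is the basic motivation for combining several classifiers rather than relying on one.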

Amit Sheth is the director of the Large Scale Distributed Information Systems Lab at the University of Georgia and a Professor of Computer Science. In 1999, he founded Taalee, Inc. and managed it as its CEO. Since its merger with Voquette, Inc. in June 2001, he has served as CTO. Prior to joining UGA in 1994, he served at Honeywell, Unisys and Bellcore. He is recognized for his work in federated database systems, semantic heterogeneity and semantic interoperability in distributed information systems, and workflow management. SCORE is the third major commercialization of his research. He has given 10 conference/workshop keynotes and over 120 invited/colloquium talks, and is among the best-cited authors in database/information systems literature.