
Semantic Association Identification and Knowledge Discovery for National Security Applications

Project Summary

Information technology (IT) is recognized as a critical component of the effort to improve national security, including homeland defense. Applications important to national security, such as aviation security, pose significant challenges to current information technology and provide an excellent source of problems for further research toward next-generation IT solutions.

Recently, there have been significant advances in applying techniques from database and information systems, knowledge representation, AI, information retrieval (including text categorization), lexical and language analysis, and other areas to develop a new generation of semantic technologies. Semantic technologies help in associating meaning with data, in organizing and correlating data more meaningfully, in converting data into information for more effective decision making, and in finding information that is contextually relevant to users’ needs. They help with syntactic and representational as well as semantic interoperability. This general area of research is also receiving renewed attention now that there is considerable excitement about the vision of the Semantic Web, characterized as the next phase of the Web.

Results from several of our past and continuing research projects have led to the development of a semantic technology called the Semantic Content Organization and Retrieval Engine (SCORE). Using SCORE’s ability to quickly create ontology-driven agents without programming, it has been possible to (a) quickly create and maintain large knowledge bases (for example, over one million entities and relationships per domain) from multiple semi-structured and structured sources of knowledge in largely (but not fully) automated ways, and (b) create semantic (domain-specific) metadata from unstructured (text), semi-structured, and structured sources of static and dynamic (e.g., query-driven) content. This technology has also been commercialized and is being used in aviation security and intelligence applications. While specifics of these applications cannot be discussed due to government and agency regulations, and many technologically possible capabilities have yet to pass through policy considerations, we imagine a prototype application of homeland security interest that helps identify and screen a passenger with respect to security risk, and we use it to develop requirements for relevant IT research. Two important challenges posed by such an application are (a) rapid identification of semantic associations involving entities (such as a passenger or a group of passengers on a flight), and (b) knowledge discovery that identifies semantic associations of interest (such as those that may pose a risk).
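To make challenge (a) concrete, one way to picture a semantic association is as a chain of relationships linking two entities in a knowledge base. The sketch below is illustrative only and is not SCORE's implementation: the triples, the entity and relationship names, and the functions `build_graph` and `find_associations` are all hypothetical, and a hop-bounded breadth-first search stands in for whatever technique a production system would use.

```python
from collections import defaultdict, deque

# Hypothetical toy knowledge base of (subject, relationship, object) triples.
TRIPLES = [
    ("passenger_A", "purchased_ticket_with", "card_1"),
    ("passenger_B", "purchased_ticket_with", "card_1"),
    ("passenger_B", "member_of", "organization_X"),
    ("passenger_C", "member_of", "organization_X"),
]


def build_graph(triples):
    """Index triples so each relationship can be followed in either direction."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((rel + "^-1", subj))  # inverse edge (naming is an assumption)
    return graph


def find_associations(graph, source, target, max_hops=4):
    """Return every cycle-free chain of at most max_hops relationships
    that leads from source to target, shortest chains first."""
    results = []
    queue = deque([(source, [])])
    while queue:
        node, path = queue.popleft()
        if node == target and path:
            results.append(path)
            continue
        if len(path) == max_hops:
            continue
        # Entities already on this chain; revisiting one would create a cycle.
        visited = {source} | {step[2] for step in path}
        for rel, nxt in graph.get(node, ()):
            if nxt not in visited:
                queue.append((nxt, path + [(node, rel, nxt)]))
    return results


graph = build_graph(TRIPLES)
for chain in find_associations(graph, "passenger_A", "passenger_C"):
    print(" -> ".join(f"{s} [{r}] {o}" for s, r, o in chain))
```

In this toy data, passenger_A and passenger_C are linked only indirectly, through a shared payment card and a shared organization with passenger_B. The `max_hops` bound is a design choice for the sketch: exhaustive path enumeration grows quickly with chain length, which is one reason rapid identification over a knowledge base with millions of entities is a research challenge.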

Our goal is to develop new techniques, and to improve the effectiveness of existing techniques, for identifying semantic associations and discovering knowledge by exploiting a large knowledge base. Specific objectives include (a) ontology-driven lazy semantic metadata extraction (i.e., annotation) to complement traditional active metadata extraction techniques, and (c) formal modeling and high-performance computation of semantic association discovery, including ontology-based contextual processing and relevancy ranking of interesting relationships. Our approach involves bootstrapping from earlier research on semantic metadata extraction, multi-ontology query processing, and other tools from the ongoing InfoQuilt project, so that we can create knowledge bases and metadata from publicly available sources to enable meaningful evaluation of the techniques.
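As a rough illustration of what context-based relevancy ranking of associations could look like, the sketch below scores each candidate association by the relationships it traverses. Everything here is an assumption made for the example rather than the project's ranking model: the weights, the relationship names, the helper names `score` and `rank_associations`, and the scoring rule itself (mean of per-relationship context weights, with ties broken in favor of shorter chains).

```python
# Hypothetical context: how relevant each relationship type is to the analyst's query.
CONTEXT_WEIGHTS = {
    "member_of": 0.9,
    "purchased_ticket_with": 0.7,
    "lives_in": 0.2,
}


def score(association, weights, default=0.1):
    """Mean context weight of the relationships in one non-empty association.
    Relationship types missing from the context get the default weight."""
    rels = [rel for _, rel, _ in association]
    return sum(weights.get(rel, default) for rel in rels) / len(rels)


def rank_associations(associations, weights):
    """Sort by descending score; shorter chains win ties."""
    return sorted(associations, key=lambda a: (-score(a, weights), len(a)))


# Each candidate is a list of (subject, relationship, object) triples that
# together connect passenger_A and passenger_B through a shared entity.
candidates = [
    [("passenger_A", "lives_in", "city_Y"),
     ("passenger_B", "lives_in", "city_Y")],
    [("passenger_A", "member_of", "organization_X"),
     ("passenger_B", "member_of", "organization_X")],
    [("passenger_A", "purchased_ticket_with", "card_1"),
     ("passenger_B", "purchased_ticket_with", "card_1")],
]

ranked = rank_associations(candidates, CONTEXT_WEIGHTS)
for association in ranked:
    print(round(score(association, CONTEXT_WEIGHTS), 2), association)
```

Under these assumed weights, the shared-organization association ranks first and the shared-city association last. Swapping in a different weight table changes the ordering without touching the associations themselves, which is the sense in which the ranking is contextual.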

Project Description, Project Plan, Yr1-Report (detailed),  Yr1-Report (short), internal documents [password needed]

People

Publications:

Related Presentations:

Demos:


This material is based upon work supported by the National Science Foundation under Grant No. 0219649. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.