Introduction
Information search is one of the most popular applications, yet it still has significant room for improvement. The availability of large amounts of structured, machine-understandable information on the Semantic Web offers opportunities for improving traditional search. The Resource Description Framework (RDF) is a powerful data model that is at the core of the W3C's Semantic Web architectural layers. It is a standard that provides interoperability of data and machine-understandable semantics for metadata. Several RDF query languages exist, including RQL, RDQL, SeRQL, TRIPLE and SPARQL. However, most real-world searches by common users involve queries expressed in a natural language they are familiar with, such as English. Such queries let users express their information needs without knowledge of the underlying schema or vocabulary of the ontologies.
The problem of natural language interfaces to knowledge bases has been studied extensively for years. [DISCOVER, BANKS, XKeyword] allow keyword search over relational databases. [XRank, Nalix] provide natural language interfaces for searching XML. Nalix uses the tree structure of XML to translate the search keywords into XQuery expressions. This work presents a keyword search interface for querying an RDF ontology: it accepts a query expressed as natural-language keywords and outputs relevant logical subgraph units from the RDF graph.
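The idea of mapping keywords to a connecting subgraph can be illustrated with a minimal sketch. This is not the system's actual implementation: the triples, labels, and function names (`match_nodes`, `build_adjacency`, `shortest_path`) are hypothetical, keyword matching is a plain substring test, and the RDF graph is treated as undirected and searched breadth-first.

```python
from collections import deque

# Hypothetical toy triples (subject, predicate, object); not from a real dataset.
TRIPLES = [
    ("ex:paper1", "ex:author", "ex:alice"),
    ("ex:paper1", "ex:topic", "ex:rdf"),
    ("ex:alice", "ex:memberOf", "ex:lab1"),
    ("ex:bob", "ex:memberOf", "ex:lab1"),
]

# Hypothetical human-readable labels used for keyword matching.
LABELS = {
    "ex:paper1": "Keyword Search on Graphs",
    "ex:alice": "Alice",
    "ex:bob": "Bob",
    "ex:rdf": "RDF",
    "ex:lab1": "Example Lab",
}


def match_nodes(keyword):
    """Return nodes whose label contains the keyword (case-insensitive)."""
    kw = keyword.strip().lower()
    return sorted(node for node, label in LABELS.items() if kw in label.lower())


def build_adjacency(triples):
    """Index the triples as an undirected graph: node -> [(predicate, neighbor)]."""
    adj = {}
    for s, p, o in triples:
        adj.setdefault(s, []).append((p, o))
        adj.setdefault(o, []).append((p, s))
    return adj


def shortest_path(triples, start, goal):
    """Breadth-first search.

    Returns a list alternating node, predicate, node, ... from start to goal,
    or None when the two nodes are not connected.
    """
    adj = build_adjacency(triples)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for pred, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [pred, nxt])
    return None


if __name__ == "__main__":
    source = match_nodes("bob")[0]
    target = match_nodes("rdf")[0]
    print(shortest_path(TRIPLES, source, target))
```

A real interface would likely rank several candidate paths rather than return only the shortest one; the sketch only shows the keyword-to-node and node-to-path steps.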
System Architecture
Related Work
- Semantic Wikipedia Query
- BLINKS: Ranked Keyword Searches on Graphs
- PANTO: A Portable Natural Language Interface to Ontologies
- The CompleteSearch Engine: Interactive, Efficient, and Towards IR & DB Integration
- Integrating DB and IR Technologies: What is the Sound of One Hand Clapping?
- The QUIQ Engine: A Hybrid IRDB System
- Sparq2l: Towards Support for Subgraph Extraction Queries in RDF Databases (LSDIS, UGA)
- SemRank: Ranking complex relationship search results on the semantic web (LSDIS, UGA)
- NaLIX: an Interactive Natural Language Interface for Querying XML
- Discover: Keyword Search in Relational Databases
- XKeyword: Efficient IR-style Keyword Search over Relational Databases
- Kowari: A platform for Semantic web storage and Analysis
- Yet Another RDF Store (YARS): Optimized Index Structures for Querying RDF from the Web
- Swoogle: Semantic Web Search Engine
- Conceptual Resource Search Engine: Querying the Semantic Web with Corese Search Engine
- BANKS: Browsing and Keyword Searching in Relational Databases
- XRANK: Ranked keyword search over XML documents
- QuizRDF: Search technology for the semantic Web
- SPARQLeR: Extended Sparql for Semantic Association Discovery (LSDIS, UGA)
Motivating Scenario
Query: How is the American Civil War related to the Writing of Thucydides?
Datasets
- LSDIS lab ontology portal
- DBpedia.org: Querying Wikipedia like a Database
- Yago: A Core of Semantic Knowledge Unifying WordNet and Wikipedia
- Mooney Queries
Misc Links
- Apache Lucene Search Project
- Good introduction to using Lucene search index
- Apache Lucene Scoring
- Apache Lucene: Class Similarity
Recent Updates
- Accept comma separated keyword input
- Remove duplicate paths from output
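The two updates above could be handled roughly as follows. This is a sketch under assumptions, not the project's code: `parse_keywords` and `dedupe_paths` are hypothetical names, a path is assumed to be a list of node and predicate identifiers, and treating a reversed path as a duplicate is an optional behavior added here for illustration.

```python
def parse_keywords(raw):
    """Split comma-separated input, trim whitespace, and drop empty entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def dedupe_paths(paths, ignore_direction=False):
    """Remove repeated paths while keeping the first occurrence of each.

    With ignore_direction=True, a path and its reverse are treated as the
    same path (an assumption; the original system may define duplicates
    differently).
    """
    seen = set()
    unique = []
    for path in paths:
        key = tuple(path)
        if ignore_direction:
            # Use the lexicographically smaller orientation as the canonical key.
            key = min(key, key[::-1])
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique


if __name__ == "__main__":
    print(parse_keywords("American Civil War, Thucydides"))
    paths = [["a", "p", "b"], ["a", "p", "b"], ["b", "p", "a"]]
    print(dedupe_paths(paths))
    print(dedupe_paths(paths, ignore_direction=True))
```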
Thesis
- Thesis write up (PDF)
- Thesis write up (DOC)
- Thesis presentation slides
- Technical Report: a report based on Sujeeth Thirumalai's MS thesis, "Keyword Search Interface for Path Query on Ontology" (Advisor: Prof. Amit Sheth), Computer Science Department, University of Georgia, Athens, July 21, 2007.