Semantics for the Real World and the Natural World in the Model World


Amit Sheth
LSDIS lab, Computer Science, University of Georgia and Semagix Inc.


Panel statement for BISC 2005 panel on "FLINT and Semantic Web"
(moderators: Elie Sanchez & Masoud Nikravesh)


The Web is made up of a vast variety of resources, pages and documents, databases and services. The majority of the data is unstructured text, and an increasing amount of data is non-textual digital media. Making sense of such data means we need to understand implicit semantics and create knowledge representations involving formal semantics, such that what we model or represent is machine processable. William Woods made a very forceful and relevant observation in this context [Wood]: "Over time, many people have responded to the need for increased rigor in knowledge representation by turning to first-order logic as a semantic criterion. This distresses me, since it is already clear that first-order logic is insufficient to deal with many semantic problems inherent in understanding natural language as well as the semantic requirements of a reasoning system for an intelligent agent using knowledge to interact with the world." Lotfi Zadeh has similarly made convincing arguments in the context of Question Answering.


Similar challenges are faced when dealing with the inherently complex natural world, as in biological objects, processes and phenomena, where many observations are inherently probabilistic and/or fuzzy. The identification and quantification of biological entities lie at the heart of the '-omics' family of disciplines, such as genomics, proteomics and glycomics. In many instances, current analytical techniques (like mass spectrometry) cannot quantify the presence of related (perhaps by mass) biological objects by their respective categories. For example, a particular sugar may attach at either of two potential positions. The sugar may attach at one position, at the other, or at both, and the identification and quantification of the amount of each category of biological entity (by position of the sugar) may not be possible [Glycomics]. Hence, representation of such non-bivalent instances will prove to be the touchstone for the success of Semantic Web technology in one of its most important areas of application. A key observation made by bioinformatics researchers is the distinction between the approaches of the biological sciences and the computational sciences; i.e. approximations (derived from statistical methods) are often used in biology instead of exact values, and a corresponding representation framework is needed for the Semantic Web to be applied to biology. If we need to model this world, we must capture the complexity that the problem poses; forcing a crisp or bivalent logic upon it would not lead to adequate modeling and hence will not lead to an acceptable solution. Another use case is the modeling of social networking, and what is termed the semantic web (with a small "s"), and the associated work in emergent and ambient semantics.


In view of the above use cases, we believe that the Semantic Web cannot afford to limit itself to RDF/S and OWL. Probabilistic and fuzzy logic as well as hybrid modeling and reasoning strategies will be necessary if our intention is to solve real problems, such as those scientists are working to address. In this context, we have challenged the community to develop a framework that spans implicit, formal and powerful semantics [Sheth].


[Wood] Woods, W., Meaning and Links: a Semantic Odyssey, KR 2004, 740-742.


[Sheth] Sheth, A. et al., Semantics for the Semantic Web: The Implicit, the Formal and the Powerful, Intl. Journal on Semantic Web and Information Systems 1(1), 2005, 1-18.