Active Semantic Documents


Reduce medical errors, improve physician efficiency, and improve patient safety and satisfaction in medical practice. Use Semantic Web technology to develop an ontology-driven process, involving automatic semantic annotation of documents and rule processing, to achieve these goals.

Technical Vision and Background

Active Semantic Documents (ASDs) are documents, typically in an XML-based format, that are both semantic and active. ASDs are semantic because they are annotated using one or more relevant ontologies, which provide the nomenclature and conceptual model for interpreting and reasoning about the annotated concepts. They may optionally be annotated with lexically significant concepts and phrases, which carry weaker semantics than annotations interpreted with respect to an ontology. ASDs are active because they support automatic and dynamic validation and decision making on the content of the document, typically by executing rules over the semantic and lexical annotations. ASDs are displayed using a Web-based interface and allow the semantic and lexical components of their content to be modified in an ontology-supported and otherwise constrained manner (such as through lists, bags of terms, specialized reference sources, or a thesaurus or lexical reference system such as WordNet).

Project Description

The LSDIS lab's collaborative research project on Active Semantic Electronic Patient Record with the Athens Heart Center (AHC) exemplifies an implementation of ASDs in a healthcare environment, specifically a cardiology practice. It has so far involved:

  • the development of populated ontologies in the healthcare (specifically cardiology practice) domain,
  • the development of an annotation tool that utilizes the developed ontologies for annotation of patient records, and
  • the development of decision support algorithms that support rule and ontology based checking/validation and evaluation.
Active Semantic Electronic Medical Records (ASEMRs) have been implemented as an enhancement of AHC's Panacea electronic medical management system, a web-based, end-to-end medical records and management system. Building on Panacea has enhanced the collaborative environment and has provided insight into the components of electronic medical records and the kinds of data available in these systems.

In the first phase of this project, we have designed, developed, and populated the following ontologies:
  • Practice Ontology: includes concepts such as practitioners, patients, insurance, facilities, etc. AHC's database was the primary source for populating this ontology.
  • Drug Ontology: includes concepts such as indications, interactions, formulary, etc. Licensed content equivalent to a physician's drug reference was the primary source for populating this ontology.
  • Diagnosis/Procedure Ontology: includes concepts such as medical conditions, treatment, ICD-9, CPT, etc. The current version of the populated ontology utilizes licensed diagnosis/procedure related data, as well as information available from the Georgia Medicare website. Licensed SNOMED content is being used in the development of the next version of the populated ontology.
Medical records of patients are automatically annotated using the above ontologies and are displayed in a browser. Drugs, allergies, physicians and facilities (e.g., physicians or facilities the patient is referred to), treatments, diagnoses, etc. are automatically annotated. The physician can pull up a contextual list, or even a visual subset of the relevant ontology, and pick alternative choices. In some cases, alternatives are provided as a ranked list (e.g., other physicians with the same specialty in the same area who accept the patient's insurance).
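
The ranked list of alternatives described above can be sketched as follows. This is only an illustrative sketch: the `Physician` record, the `rank_alternatives` function, the placeholder names, and the scoring scheme (same specialty as a hard filter, then one point each for same area and accepted insurance) are assumptions made for this example, not the ranking actually used in the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Physician:
    # Hypothetical record; in the real system this data lives in the practice ontology.
    name: str
    specialty: str
    area: str
    accepted_insurers: frozenset


def rank_alternatives(current, patient_insurer, candidates):
    """Return other physicians with the same specialty, best match first."""

    def score(p):
        # One point for being in the same area, one for accepting the patient's insurer.
        return int(p.area == current.area) + int(patient_insurer in p.accepted_insurers)

    others = [
        p for p in candidates
        if p.name != current.name and p.specialty == current.specialty
    ]
    # Highest score first; ties are broken alphabetically by name.
    return sorted(others, key=lambda p: (-score(p), p.name))


current = Physician("Dr. A", "cardiology", "Athens", frozenset({"InsurerX"}))
candidates = [
    current,
    Physician("Dr. D", "cardiology", "Athens", frozenset({"InsurerY"})),
    Physician("Dr. C", "cardiology", "Atlanta", frozenset({"InsurerX"})),
    Physician("Dr. B", "cardiology", "Athens", frozenset({"InsurerX"})),
    Physician("Dr. E", "dermatology", "Athens", frozenset({"InsurerX"})),
]
ranked = rank_alternatives(current, "InsurerX", candidates)
print([p.name for p in ranked])  # ['Dr. B', 'Dr. C', 'Dr. D']
```

Treating specialty as a filter and the other two criteria as a score keeps partially matching physicians in the list instead of dropping them, which fits a ranked presentation.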

ASEMRs support active features by executing relevant rules over semantic annotations to support the following initial sets of capabilities:
  • drug-drug interaction check,
  • drug formulary check (e.g., whether the drug is covered by the insurance company of the patient, and if not what the alternative drugs in the same class of drug are),
  • drug dosage range check,
  • drug-allergy interaction check,
  • ICD-9 annotations choice for the physician to validate and choose the best possible code for the treatment type, and
  • preferred drug recommendation based on drug and patient insurance information
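
Two of the checks listed above, the formulary check and the dosage range check, can be sketched as simple lookups. Everything in this sketch is hypothetical: the drug and insurer names are placeholders, and the dictionaries `FORMULARY`, `DRUG_CLASS`, and `DOSAGE_RANGE_MG` stand in for data that the real system draws from the drug and practice ontologies.

```python
# Placeholder data standing in for ontology content (all names and values are invented).
FORMULARY = {"InsurerX": {"drug_a", "drug_c"}}
DRUG_CLASS = {"drug_a": "class_1", "drug_b": "class_1", "drug_c": "class_1", "drug_d": "class_2"}
DOSAGE_RANGE_MG = {"drug_a": (10, 80)}


def formulary_check(drug, insurer):
    """Report whether `drug` is covered and, if not, covered drugs in the same class."""
    covered = FORMULARY.get(insurer, set())
    if drug in covered:
        return {"covered": True, "alternatives": []}
    drug_class = DRUG_CLASS.get(drug)
    if drug_class is None:
        # Unknown class: no alternatives can be suggested.
        return {"covered": False, "alternatives": []}
    alternatives = sorted(d for d in covered if DRUG_CLASS.get(d) == drug_class)
    return {"covered": False, "alternatives": alternatives}


def dosage_check(drug, dose_mg):
    """Return True/False if a range is known for `drug`, or None if it is not."""
    dose_range = DOSAGE_RANGE_MG.get(drug)
    if dose_range is None:
        return None
    low, high = dose_range
    return low <= dose_mg <= high


print(formulary_check("drug_b", "InsurerX"))
print(dosage_check("drug_a", 100))
```

Returning `None` for an unknown dosage range, rather than `False`, lets the caller distinguish "out of range" from "no data available".
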
See example unannotated xml.
See example annotated xml.

Note: Style sheets have been applied. Right click to view source. Style sheets created for IE only.


  • The most important benefit we seek from ASEMR (with its proactive semantic annotations and rule-based evaluation) is the reduction of medical errors that could occur through oversight.
  • Checks such as preferred drug recommendations lead to prescription drug savings for patients, which in turn improve patient satisfaction.
  • Assistance in choosing the medically appropriate ICD-9 code could lead to fewer communications with the insurer and faster payment.


This system is currently transitioning to an operational deployment at AHC (expected by October 2005). The system supports comprehensive management facilities such that ontologies can be automatically populated or updated by running knowledge extractors over changed sources.

Implementation Details at AHC

AHC's Panacea can produce an XML representation of the patient's record. This document contains all of the patient's demographic and insurance information, along with information about a particular visit. Information about a particular visit includes a textual description of past and present problems, drugs that the patient was on before the visit, drugs the patient is on after the visit, and a textual description of the visit in natural language that is based on a regular language.

The document that is created by AHC's Panacea software is passed to the annotation tool via a web service. The annotation tool, through a mixture of XPath and RDQL, annotates concepts from each of the ontologies in turn. In the context of ASDs, annotation means the introduction of new tags and data into the existing XML to act as metadata and point back to a concept in a particular ontology. The next step is to make the lexical annotations. Again, new XML tags are introduced to mark these annotations. Since parts of the XML produced by AHC's Panacea are based on a regular language, these annotations are more powerful than lexical annotations based on simple string matching.
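
As a rough illustration of the annotation step described above, the sketch below locates drug mentions with an XPath expression and wraps each recognized one in a new tag that points back to an ontology concept. The XML layout, the `annotation` tag and its attributes, the `example.org` URIs, and the `DRUG_CONCEPTS` dictionary are all assumptions made for this example. In particular, the dictionary lookup stands in for the RDQL queries against the ontology, which are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of a patient record; the real Panacea schema is not shown in this document.
RECORD = """<visit>
  <medications>
    <drug>drug_a</drug>
    <drug>drug_x</drug>
  </medications>
</visit>"""

# Stand-in for an ontology lookup (placeholder names and URIs).
DRUG_CONCEPTS = {"drug_a": "http://example.org/drug-ontology#drug_a"}


def annotate_drugs(xml_text):
    """Wrap each recognized drug mention in an <annotation> tag pointing to its concept."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a subset of XPath, which is enough for this path.
    for drug in root.findall(".//medications/drug"):
        name = (drug.text or "").strip()
        uri = DRUG_CONCEPTS.get(name.lower())
        if uri is None:
            continue  # not found in the ontology: leave the element untouched
        drug.text = None
        annotation = ET.SubElement(drug, "annotation", {"ontology": "drug", "concept": uri})
        annotation.text = name
    return ET.tostring(root, encoding="unicode")


print(annotate_drugs(RECORD))
```

The original text of each mention is preserved inside the new tag, so a style sheet can still render the record while using the `concept` attribute for highlighting or lookups.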

After the document has been annotated, rules are applied to it. A rule can require the existence or the absence of a relationship in an ontology, and can span multiple ontologies. If a rule is broken, another annotation is added. For example, an 'interaction' relationship should not exist between two drugs, and a 'covered' relationship should exist between a drug and the patient's insurance (the latter rule involves both the drug and practice ontologies).
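
The two example rules above can be sketched as checks for the absence or presence of a relationship. This is a minimal sketch under stated assumptions: the ontologies are reduced to a set of (subject, relationship, object) triples named `TRIPLES`, the drug and insurer names are placeholders, and the `related` and `evaluate_rules` helpers are hypothetical rather than part of the actual rule processor.

```python
from itertools import combinations

# Placeholder relationships standing in for the drug and practice ontologies.
TRIPLES = {
    ("drug_a", "interaction", "drug_b"),
    ("drug_a", "covered", "InsurerX"),
}


def related(subject, relationship, obj, symmetric=False):
    """True if the relationship is asserted (in either direction when symmetric)."""
    if (subject, relationship, obj) in TRIPLES:
        return True
    return symmetric and (obj, relationship, subject) in TRIPLES


def evaluate_rules(drugs, insurer):
    """Return one alert per broken rule; each alert would become a new annotation."""
    alerts = []
    # Rule 1: an 'interaction' relationship must NOT exist between any two drugs.
    for a, b in combinations(sorted(drugs), 2):
        if related(a, "interaction", b, symmetric=True):
            alerts.append(("interaction", a, b))
    # Rule 2: a 'covered' relationship MUST exist between each drug and the insurer.
    for drug in sorted(drugs):
        if not related(drug, "covered", insurer):
            alerts.append(("not_covered", drug, insurer))
    return alerts


print(evaluate_rules(["drug_b", "drug_a"], "InsurerX"))
# [('interaction', 'drug_a', 'drug_b'), ('not_covered', 'drug_b', 'InsurerX')]
```

Expressing both rules through the same `related` lookup mirrors the description above: one rule fires when a relationship is present, the other when it is missing.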

The new XML document, now full of annotations, is then passed back to AHC's Panacea. Panacea applies an XML style sheet that allows the XML to be shown in the user's browser. Annotations are shown through highlighting and the addition of symbols. JavaScript and asynchronous JavaScript calls are used to allow the user to change the document.

We plan to measure the effectiveness of the system in an operational setting. Example measurements include the number of alerts by type (e.g., drug interactions), the number of changes made by the physician, the physician's view of time saved or added by using the system, and improvement in billing payments due to any change and validation of ICD-9 codes. The first set of evaluation results is expected by December 2005.


This is a collaborative project led by Dr. Subodh Agrawal, MD and Prof. Amit Sheth. Dr. Agrawal is an interventional cardiologist and heads the Athens Heart Center (AHC), the largest cardiology practice in Athens, GA. He provided the medical vision for this effort. Prof. Sheth is a researcher, educator and entrepreneur who directs the Large Scale Distributed Information Systems (LSDIS) lab at the University of Georgia. He provided the technical vision and leadership for this work. Dr. Tripp Wingate, MD, provided additional insights and requirements for this collaboration.

Work at LSDIS was coordinated by Dev Palaniswami, a research scientist. Work at AHC was coordinated by Shyam Prabhakar, IT Manager. Jon Lathem of AHC and LSDIS lab implemented annotation and rule processing components. Cory Henson of LSDIS lab developed the drug ontology. Matthew Eavenson of LSDIS developed the diagnosis/procedure ontology. Jon and Shyam developed the practice ontology.

This work is an application of the Active Semantic Documents project at the LSDIS lab. We acknowledge the use of Semagix Freedom for populating ontologies.

Part of the following presentation covers this project: Semantic Web & Semantic Web Services: Applications in Healthcare and Scientific Research, a keynote talk given by Professor Amit Sheth at the IFIP Working Conference on Industrial Applications of Semantic Web (IASW - 2005), Jyväskylä, Finland, August 25-27, 2005.