Nelson Rushton describes NewsExtract in his assignment number 1. According to Rushton, the developer of this system, Erik T. Mueller, claims that XMLNews is not sufficient.
XMLNews is a subset of NITF, a text standard that uses HTML-like markup to provide some metadata.
The review uses an example to explain why NewsML is not sufficient. It becomes clear that many details cannot be expressed in a machine-understandable way: there simply aren't enough tags. NewsExtract provides an ontology that introduces many more concepts.
Some examples of these concepts are given in the review.
According to the review, NewsExtract also includes a natural language parser. Thus the system also works with the data itself and not just the metadata.
Looking at the page on which Erik T. Mueller describes NewsExtract, it became clear to me that NewsExtract uses the ontology to tag text files automatically by means of natural language processing techniques.
Nelson Rushton mentions that human proofreading of the automatically created file is still required, but that the author of the system claims that NewsExtract offers a great speed-up.