org.openide.xml
Class XMLUtil

java.lang.Object
  extended by org.openide.xml.XMLUtil

public final class XMLUtil
extends java.lang.Object

Utility class collecting library methods related to XML processing.

Remember that when parsing XML files you often want to set an explicit entity resolver. For example, consider a file such as this:

 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE root PUBLIC "-//NetBeans//DTD Foo 1.0//EN" "http://www.netbeans.org/dtds/foo-1_0.dtd">
 <root/>
 

If you parse this with a null entity resolver, or you use the default resolver (EntityCatalog.getDefault()) but do not do anything special with this DTD, the parse will probably block while it makes a network connection, even when you are not validating. That is because DTDs can define entities and other XML oddities, and are not a pure constraint language like Schema or RELAX-NG.

There are three basic ways to avoid the network connection.

  1. Register the DTD. This is generally the best thing to do. See EntityCatalog's documentation for details, but for example in your layer use:

     <filesystem>
       <folder name="xml">
         <folder name="entities">
           <folder name="NetBeans">
             <file name="DTD_Foo_1_0"
                   url="nbres:/org/netbeans/modules/mymod/resources/foo-1_0.dtd">
               <attr name="hint.originalPublicID"
                     stringvalue="-//NetBeans//DTD Foo 1.0//EN"/>
             </file>
           </folder>
         </folder>
       </folder>
     </filesystem>
     

    Now the default system entity catalog will resolve the public ID to the local copy in your module, not the network copy. Additionally, anyone who mounts the "NetBeans Catalog" in the XML Entity Catalogs node in the Runtime tab will be able to use your local copy of the DTD automatically, for validation, code completion, etc. (The network URL should really exist, though, for the benefit of other tools!)

  2. You can also set an explicit entity resolver which maps that particular public ID to some local copy of the DTD, if you do not want to register it globally in the system for some reason. If handed other public IDs, just return null to indicate that the system ID should be loaded.

  3. In some cases where XML parsing is very performance-sensitive, and you know that you do not need validation and furthermore that the DTD defines no infoset (there are no entity or character definitions, etc.), you can speed up the parse. Turn off validation, but also supply a custom entity resolver that does not even bother to load the DTD at all:

     public InputSource resolveEntity(String pubid, String sysid)
         throws SAXException, IOException {
       // Constant-first comparison: pubid is null when the entity has no public ID.
       if ("-//NetBeans//DTD Foo 1.0//EN".equals(pubid)) {
         // An empty stream stands in for the DTD, so nothing is loaded.
         return new InputSource(new ByteArrayInputStream(new byte[0]));
       } else {
         return EntityCatalog.getDefault().resolveEntity(pubid, sysid);
       }
     }
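
The third approach can be tried outside the IDE with plain JAXP. The following is a minimal sketch, not part of XMLUtil: the class name EmptyDtdResolverSketch and its members are hypothetical, and unknown public IDs return null (so the parser loads the system ID itself) instead of delegating to EntityCatalog, which is assumed to be unavailable here.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical sketch class; uses only JDK classes, not XMLUtil or EntityCatalog.
class EmptyDtdResolverSketch {

    static final String FOO_PUBLIC_ID = "-//NetBeans//DTD Foo 1.0//EN";

    // The sample file from the class description.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE root PUBLIC \"-//NetBeans//DTD Foo 1.0//EN\" "
        + "\"http://www.netbeans.org/dtds/foo-1_0.dtd\">\n"
        + "<root/>";

    // Supplies an empty stream for the known DTD; returning null for any
    // other ID tells the parser to load the system ID itself.
    static EntityResolver emptyDtdResolver() {
        return (publicId, systemId) -> {
            if (FOO_PUBLIC_ID.equals(publicId)) {
                return new InputSource(new ByteArrayInputStream(new byte[0]));
            }
            return null;
        };
    }

    // Parses without validation and without fetching the Foo DTD.
    static Document parseWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(emptyDtdResolver());
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWithoutDtd(SAMPLE);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

This shortcut is only safe when the document relies on nothing the DTD would define, as noted above.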
     

Since:
release 3.2
Author:
Petr Kuzel

Method Summary
static org.w3c.dom.Document createDocument(java.lang.String rootQName, java.lang.String namespaceURI, java.lang.String doctypePublicID, java.lang.String doctypeSystemID)
          Creates an empty DOM Document using the JAXP factory.
static org.xml.sax.XMLReader createXMLReader()
          Create a simple parser.
static org.xml.sax.XMLReader createXMLReader(boolean validate)
          Create a simple parser, possibly validating.
static org.xml.sax.XMLReader createXMLReader(boolean validate, boolean namespaceAware)
          Create a SAX parser from the JAXP factory.
static byte[] fromHex(char[] hex, int start, int len)
          Decodes data encoded using toHex.
static org.w3c.dom.Document parse(org.xml.sax.InputSource input, boolean validate, boolean namespaceAware, org.xml.sax.ErrorHandler errorHandler, org.xml.sax.EntityResolver entityResolver)
          Creates a DocumentBuilder from the JAXP factory and uses it to build an org.w3c.dom.Document.
static java.lang.String toAttributeValue(java.lang.String val)
          Escapes the passed string as an XML attribute value (<, &, ' and " will be escaped).
static java.lang.String toElementContent(java.lang.String val)
          Escape passed string as XML element content (<, & and > in ]]> sequences).
static java.lang.String toHex(byte[] val, int start, int len)
          Can be used to encode values that contain invalid XML characters.
static void write(org.w3c.dom.Document doc, java.io.OutputStream out, java.lang.String enc)
          Writes a DOM document to a stream.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

createXMLReader

public static org.xml.sax.XMLReader createXMLReader()
                                             throws org.xml.sax.SAXException
Create a simple parser.

Returns:
createXMLReader(false, false)
Throws:
org.xml.sax.SAXException

createXMLReader

public static org.xml.sax.XMLReader createXMLReader(boolean validate)
                                             throws org.xml.sax.SAXException
Create a simple parser, possibly validating.

Parameters:
validate - if true, a validating parser is returned
Returns:
createXMLReader(validate, false)
Throws:
org.xml.sax.SAXException

createXMLReader

public static org.xml.sax.XMLReader createXMLReader(boolean validate,
                                                    boolean namespaceAware)
                                             throws org.xml.sax.SAXException
Create a SAX parser from the JAXP factory. The result can be used to parse XML files.

See class Javadoc for hints on setting an entity resolver. This parser has its entity resolver set to the system entity resolver chain.

Parameters:
validate - if true, a validating parser is returned
namespaceAware - if true, a namespace aware parser is returned
Returns:
XMLReader configured according to passed parameters
Throws:
FactoryConfigurationError - Application developers should never need to directly catch errors of this type.
org.xml.sax.SAXException - if a parser fulfilling given parameters can not be created
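
For readers who want to see what obtaining a SAX parser from the JAXP factory involves, here is a rough sketch using only JDK classes. It is not the XMLUtil implementation: XmlReaderSketch, newReader and countElements are hypothetical names, and the sketch does not install the system entity resolver chain that createXMLReader sets up.

```java
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch class; plain JAXP only.
class XmlReaderSketch {

    // Configures the factory according to the two flags and returns its reader.
    static XMLReader newReader(boolean validate, boolean namespaceAware)
            throws SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validate);
        factory.setNamespaceAware(namespaceAware);
        try {
            return factory.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException e) {
            // Surface configuration problems as SAXException.
            throw new SAXException("Cannot create the requested parser", e);
        }
    }

    // Usage example: counts start tags in a small document.
    static int countElements(String xml) throws Exception {
        int[] count = {0};
        XMLReader reader = newReader(false, true);
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><c/></a>"));
    }
}
```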

createDocument

public static org.w3c.dom.Document createDocument(java.lang.String rootQName,
                                                  java.lang.String namespaceURI,
                                                  java.lang.String doctypePublicID,
                                                  java.lang.String doctypeSystemID)
                                           throws org.w3c.dom.DOMException
Creates an empty DOM Document using the JAXP factory. E.g.:

 Document doc = createDocument("book", null, null, null);
 

creates a new DOM for a well-formed document whose root element is named book.

Parameters:
rootQName - qualified name of root element. e.g. myroot or ns:myroot
namespaceURI - URI of root element namespace or null
doctypePublicID - public ID of DOCTYPE or null
doctypeSystemID - system ID of DOCTYPE or null if no DOCTYPE required and doctypePublicID is also null
Returns:
new DOM Document
Throws:
org.w3c.dom.DOMException - if new DOM with passed parameters can not be created
FactoryConfigurationError - Application developers should never need to directly catch errors of this type.
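
The following sketch shows one way the four parameters could map onto the standard DOM API. It is an assumption about the general approach, not the actual XMLUtil code; CreateDocumentSketch and newDocument are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

// Hypothetical sketch class; plain JAXP and DOM only.
class CreateDocumentSketch {

    static Document newDocument(String rootQName, String namespaceURI,
                                String publicId, String systemId)
            throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DOMImplementation impl =
            factory.newDocumentBuilder().getDOMImplementation();

        // A DOCTYPE node is only created when a system ID is supplied.
        DocumentType doctype = null;
        if (systemId != null) {
            doctype = impl.createDocumentType(rootQName, publicId, systemId);
        }
        // Creates the document together with its root element.
        return impl.createDocument(namespaceURI, rootQName, doctype);
    }

    public static void main(String[] args) throws Exception {
        Document doc = newDocument("book", null, null, null);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```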

parse

public static org.w3c.dom.Document parse(org.xml.sax.InputSource input,
                                         boolean validate,
                                         boolean namespaceAware,
                                         org.xml.sax.ErrorHandler errorHandler,
                                         org.xml.sax.EntityResolver entityResolver)
                                  throws java.io.IOException,
                                         org.xml.sax.SAXException
Creates a DocumentBuilder from the JAXP factory and uses it to build an org.w3c.dom.Document from the given InputSource. On success, the document tree is returned.

Parameters:
input - a parser input (for a URL, use: new InputSource(url.toExternalForm()))
validate - if true, a validating parser is used
namespaceAware - if true, the DOM is created by a namespace-aware parser
errorHandler - an error handler to be notified about exceptions, or null
entityResolver - SAX entity resolver or null; see class Javadoc for hints
Returns:
document representing given input, or null if a parsing error occurs
Throws:
java.io.IOException - if an I/O problem during parsing occurs
org.xml.sax.SAXException - if a parser error occurs
FactoryConfigurationError - Application developers should never need to directly catch errors of this type.

write

public static void write(org.w3c.dom.Document doc,
                         java.io.OutputStream out,
                         java.lang.String enc)
                  throws java.io.IOException
Writes a DOM document to a stream. The precise output format is not guaranteed but this method will attempt to indent it sensibly.

Important: There might be some problems with <![CDATA[ ]]> sections in the DOM tree you pass into this method. Specifically, some CDATA sections may not be written as CDATA sections, or may be merged with other CDATA sections at the same level. Also, if plain text nodes are mixed with CDATA sections at the same level, all the text is likely to end up in one big CDATA section.
For nodes that only have one CDATA section this method should work fine.

Parameters:
doc - DOM document to be written
out - data sink
enc - XML-defined encoding name (e.g. "UTF-8")
Throws:
java.io.IOException - if JAXP fails or the stream cannot be written to
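
As an illustration of serializing a DOM through JAXP, the sketch below uses an identity Transformer with indentation turned on. Whether XMLUtil.write works this way internally is an assumption; WriteDocumentSketch and its methods are hypothetical, and the exact whitespace of the output depends on the JAXP implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical sketch class; plain JAXP only.
class WriteDocumentSketch {

    // Serializes the document with the requested encoding and indentation.
    static void write(Document doc, OutputStream out, String enc)
            throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, enc);
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        t.transform(new DOMSource(doc), new StreamResult(out));
    }

    // Usage example: builds <book>Dune</book> and returns it as text.
    static String demo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("book"))
           .appendChild(doc.createTextNode("Dune"));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(doc, out, "UTF-8");
        return out.toString("UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```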

toAttributeValue

public static java.lang.String toAttributeValue(java.lang.String val)
                                         throws java.io.CharConversionException
Escapes the passed string as an XML attribute value (<, &, ' and " will be escaped). Note: an XML processor returns a normalized value, which may differ from the original.

Parameters:
val - a string to be escaped
Returns:
escaped value
Throws:
java.io.CharConversionException - if val contains an improper XML character
Since:
1.40

toElementContent

public static java.lang.String toElementContent(java.lang.String val)
                                         throws java.io.CharConversionException
Escape passed string as XML element content (<, & and > in ]]> sequences).

Parameters:
val - a string to be escaped
Returns:
escaped value
Throws:
java.io.CharConversionException - if val contains an improper XML character
Since:
1.40
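
The escaping rules described for toAttributeValue and toElementContent can be illustrated with a small stand-alone sketch. It is not the XMLUtil source: XmlEscapeSketch and its methods are hypothetical, the choice of &apos; and &quot; as replacements is an assumption, and the validity check is simplified to rejecting control characters other than tab, line feed and carriage return.

```java
import java.io.CharConversionException;

// Hypothetical sketch class illustrating the documented escaping rules.
class XmlEscapeSketch {

    // Simplified check: rejects control characters that XML 1.0 forbids.
    static void checkChar(char c) throws CharConversionException {
        if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
            throw new CharConversionException(
                "Invalid XML character: 0x" + Integer.toHexString(c));
        }
    }

    // Escapes <, &, ' and " for use inside an attribute value.
    static String escapeAttribute(String val) throws CharConversionException {
        StringBuilder sb = new StringBuilder(val.length());
        for (int i = 0; i < val.length(); i++) {
            char c = val.charAt(i);
            checkChar(c);
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '\'': sb.append("&apos;"); break;
                case '"':  sb.append("&quot;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Escapes < and &, and > only where it would complete a "]]>" sequence.
    static String escapeContent(String val) throws CharConversionException {
        StringBuilder sb = new StringBuilder(val.length());
        for (int i = 0; i < val.length(); i++) {
            char c = val.charAt(i);
            checkChar(c);
            if (c == '<') {
                sb.append("&lt;");
            } else if (c == '&') {
                sb.append("&amp;");
            } else if (c == '>' && i >= 2
                    && val.charAt(i - 1) == ']' && val.charAt(i - 2) == ']') {
                sb.append("&gt;");
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```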

toHex

public static java.lang.String toHex(byte[] val,
                                     int start,
                                     int len)
Can be used to encode values that contain invalid XML characters. After parsing, the companion method fromHex must be used to recover the original value.

Parameters:
val - data to be converted
start - offset
len - count
Since:
1.29

fromHex

public static byte[] fromHex(char[] hex,
                             int start,
                             int len)
                      throws java.io.IOException
Decodes data encoded using toHex.

Parameters:
hex - data to be converted
start - offset
len - count
Throws:
java.io.IOException - if input does not represent hex encoded value
Since:
1.29
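
The toHex and fromHex pair amounts to a hex round trip, which the sketch below illustrates. It is not the XMLUtil implementation: HexSketch, encode and decode are hypothetical names, and the use of lowercase digits is an assumption, since the documentation does not specify the case.

```java
import java.io.IOException;

// Hypothetical sketch class showing a hex encode/decode round trip.
class HexSketch {

    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Encodes len bytes starting at start as two hex digits per byte.
    static String encode(byte[] val, int start, int len) {
        StringBuilder sb = new StringBuilder(len * 2);
        for (int i = start; i < start + len; i++) {
            sb.append(DIGITS[(val[i] >> 4) & 0x0f]);
            sb.append(DIGITS[val[i] & 0x0f]);
        }
        return sb.toString();
    }

    // Decodes len hex digits starting at start back into bytes.
    static byte[] decode(char[] hex, int start, int len) throws IOException {
        if (len % 2 != 0) {
            throw new IOException("Odd number of hex digits: " + len);
        }
        byte[] out = new byte[len / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex[start + 2 * i], 16);
            int lo = Character.digit(hex[start + 2 * i + 1], 16);
            if (hi < 0 || lo < 0) {
                throw new IOException("Not a hex-encoded value");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0, 0x1f, (byte) 0xff};
        String text = encode(data, 0, data.length);
        byte[] back = decode(text.toCharArray(), 0, text.length());
        System.out.println(text + " -> " + back.length + " bytes");
    }
}
```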