XML Validation with the Java API

Both Java and XML are widespread and used intensively. This post sheds some light on the possibilities for validating XML files with the Java API. In all code listings, exception handling and imports are omitted. The classes used are from the javax.xml, javax.xml.parsers, javax.xml.transform and javax.xml.validation packages. Moreover, this post focuses on simple validation code snippets.

First, we can check if the XML file is well-formed. This can be done by parsing the XML file into a DOM document.

String path = "my/test/path/to/file.xml";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
// parse() throws a SAXException if the file is not well-formed
db.parse(new File(path));
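
Since the listings leave out exception handling, here is a minimal sketch of how the well-formedness check might be wrapped into a method that returns a boolean. The class name WellFormedCheck and the method isWellFormed are made up for this illustration, and the XML is read from a string instead of a file to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original listings.
final class WellFormedCheck {

    /** Returns true if the given text parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            // DefaultHandler still throws on fatal errors; setting it here is
            // intended to keep the parser from printing messages to stderr.
            db.setErrorHandler(new DefaultHandler());
            db.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports well-formedness violations as SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, WellFormedCheck.isWellFormed("<a><b/></a>") should return true, while an unclosed tag such as "<a><b></a>" should return false.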

Next, it is possible to validate the XML file against an XML schema. Before doing so, we validate the XSD file itself to ensure that the schema is valid.

String path = "my/test/path/to/file.xsd";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
// newSchema() throws a SAXException if the XSD itself is invalid
sFactory.newSchema(new File(path));

Once the XSD file is known to be valid, we can validate the XML file against it.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new File("path/to/schema.xsd"));
Validator validator = schema.newValidator();
// with the default error handler, validate() throws a SAXException on the first validation error
validator.validate(new StreamSource(new File(path)));
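
By default the validator stops at the first error. If you would rather see all problems at once, one option is to register an ErrorHandler that collects them. The following is only a sketch under that assumption: the class SchemaCheck and its validate method are hypothetical names, and both the schema and the document are passed as strings so the example runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of the original listings.
final class SchemaCheck {

    /** Validates xml against xsd; an empty list means no problems were reported. */
    static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                @Override public void error(SAXParseException e) {
                    // Record the error and let validation continue.
                    problems.add("error: " + e.getMessage());
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not recoverable, handled by the catch block below
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            // Also reached when the XSD itself cannot be compiled.
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }
}
```

The returned list can then be logged or shown to the user; the exact message texts depend on the XML parser in use.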

Many XML files use more than one XML schema, and for those the code above fails because it loads only a single XSD. Therefore, we need to create a Schema that consists of multiple XSD files.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new Source[]{
    new StreamSource(new File("/path/to/schema1.xsd")),
    new StreamSource(new File("/path/to/schema2.xsd"))
});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
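
To make the effect of combining schemas visible, here is a small self-contained sketch. Everything in it is invented for illustration: the class MultiSchemaCheck, the namespaces urn:a and urn:b, and the two tiny schemas. Schema A allows any strictly validated child element from another namespace, and schema B declares such an element.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class; schemas and namespaces are made up for this example.
final class MultiSchemaCheck {

    static final String SCHEMA_A =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:a' elementFormDefault='qualified'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:any namespace='##other' processContents='strict'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    static final String SCHEMA_B =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:b' elementFormDefault='qualified'>"
      + "<xs:element name='item' type='xs:string'/></xs:schema>";

    static final String DOCUMENT =
        "<a:order xmlns:a='urn:a' xmlns:b='urn:b'><b:item>x</b:item></a:order>";

    /** Returns true if xml validates against the schema built from all given XSD texts. */
    static boolean isValid(String xml, String... xsds) {
        try {
            Source[] sources = new Source[xsds.length];
            for (int i = 0; i < xsds.length; i++) {
                sources[i] = new StreamSource(new StringReader(xsds[i]));
            }
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(sources);
            // The default error handler throws on the first validation error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling MultiSchemaCheck.isValid(DOCUMENT, SCHEMA_A) is expected to fail because no declaration for b:item is known, while isValid(DOCUMENT, SCHEMA_A, SCHEMA_B) should succeed. When the schemas reference each other, the order of the sources may matter for some parsers, so it is worth testing with your real XSD files.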

When the XSD files are shipped inside the JAR, they have to be loaded as classpath resources instead of files.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new Source[]{
    new StreamSource(getClass().getResourceAsStream("/path/to/schema1.xsd")),
    new StreamSource(getClass().getResourceAsStream("/path/to/schema2.xsd"))
});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
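
Two details are easy to miss with resource streams: getResourceAsStream returns null when the resource is missing, and a StreamSource built from a bare stream has no system ID, so relative xs:include or xs:import locations inside the XSD may not resolve. A possible variant, sketched here with the hypothetical helper SchemaResources.resourceSource, is to look up the resource URL and hand it to the StreamSource as the system ID.

```java
import java.net.URL;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original listings.
final class SchemaResources {

    /** Builds a Source for a classpath resource, keeping its URL as the system ID. */
    static Source resourceSource(Class<?> owner, String name) {
        URL url = owner.getResource(name);
        if (url == null) {
            // Fail early with a clear message instead of passing null on.
            throw new IllegalArgumentException("Schema resource not found: " + name);
        }
        // The system ID gives the parser a base location for relative references.
        return new StreamSource(url.toExternalForm());
    }
}
```

In the listing above, each new StreamSource(getClass().getResourceAsStream(...)) could then be replaced by SchemaResources.resourceSource(getClass(), "/path/to/schema1.xsd") and so on.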

XSLT Tricks

Curly Brackets and Dollar Signs

When you want to generate the expression ${property.key} with XSLT and simply write it in a place where XSLT evaluates curly brackets (an attribute value template or, in XSLT 3.0, text with expand-text="yes"), XSLT interprets the content of the curly brackets as an XPath expression and prints the value it calculates. This fails silently: the output just shows $, because {property.key} evaluates to an empty string.

The trick is to use double curly brackets, as shown in the listing below. Doubled brackets are an escape sequence: {{ produces a literal { and }} produces a literal }, so the output contains ${property.key} as a plain string.

<!-- evaluates to $ -->
<xsl:template match="...">
${property.key}
</xsl:template>

<!-- evaluates to ${property.key} -->
<xsl:template match="...">
${{property.key}}
</xsl:template>
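
The escaping can also be tried out from Java with the XSLT processor that ships with the JDK. That processor implements XSLT 1.0, where curly brackets are only evaluated in attribute value templates, so this sketch puts the expression into an attribute. The class BraceEscapeDemo, the method render and the out element are invented for the demonstration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the original post.
final class BraceEscapeDemo {

    /**
     * Places attrValue into an attribute of a literal result element and
     * returns the serialized transformation result.
     * attrValue must not contain a single quote in this simple sketch.
     */
    static String render(String attrValue) {
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output omit-xml-declaration='yes'/>"
          + "<xsl:template match='/'><out value='" + attrValue + "'/></xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            // The input document is irrelevant here; any well-formed XML works.
            transformer.transform(new StreamSource(new StringReader("<doc/>")),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With single brackets, render("${property.key}") should yield an attribute value of just $, whereas render("${{property.key}}") should yield ${property.key}.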

Adding types to variables

You can add types to variables in your stylesheet with the as attribute, which is available from XSLT 2.0 on. This allows for better error messages, as these types are checked during execution.

<!-- variable with dynamic type -->
<xsl:template match="...">
     <xsl:variable name="var1" select="..." />
</xsl:template>

<!-- variable with static type string -->

<xsl:template match="...">
     <xsl:variable name="var1" select="..." as="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" />
</xsl:template>

More information and other cool tricks for making XSLT 2.0 safer can be found at http://www.ibm.com/developerworks/library/x-safexslt/