XML Validation with the Java API

Java and XML are both widespread and heavily used. This post sheds some light on the options for validating XML files with the Java API. All code listings omit exception handling and imports. The classes used come from the javax.xml, javax.xml.parsers, javax.xml.transform.stream and javax.xml.validation packages. The post focuses on simple validation code snippets.

First, we can check whether the XML file is well-formed by parsing it into a DOM document. If the file is not well-formed, parse() throws a SAXException.

String path = "my/test/path/to/file.xml";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
db.parse(new File(path));
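
As a minimal sketch of how the omitted exception handling might look, the check can be wrapped in a method that reports well-formedness as a boolean. The class name WellFormedCheck and the method isWellFormed are made up for this illustration, and the XML is read from a string instead of a file so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative only.
class WellFormedCheck {

    // Returns true if the XML text can be parsed into a DOM document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            db.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports malformed XML as a SAXException.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            // Not a well-formedness problem: reading or parser setup failed.
            throw new IllegalStateException(e);
        }
    }
}
```

For a file on disk, the same method body works with db.parse(new File(path)) as in the listing above.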

Next, we can validate the XML file against an XML schema. Before doing so, we check the XSD file itself: creating a Schema object from it fails with a SAXException if the schema is not valid.

String path = "my/test/path/to/file.xsd";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
sFactory.newSchema(new File(path));
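
The schema check can be sketched the same way. The class SchemaCheck and the method isValidSchema are hypothetical names, and the schema is passed as a string instead of a file to keep the example self-contained.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative only.
class SchemaCheck {

    // Returns true if the XSD text can be compiled into a Schema object.
    static boolean isValidSchema(String xsd) {
        SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            sFactory.newSchema(new StreamSource(new StringReader(xsd)));
            return true;
        } catch (SAXException e) {
            // Thrown when the schema document is malformed or invalid.
            return false;
        }
    }
}
```

A schema that refers to a type that does not exist, for example, is reported as invalid by this check.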

With a valid XSD file, we can validate the XML file against the schema. If the document does not conform to the schema, validate() throws a SAXException.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new File("path/to/schema.xsd"));
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
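
By default the validator stops at the first violation. A possible variation, sketched here under the assumption that a full list of problems is wanted, registers an ErrorHandler that collects validation errors instead of throwing. The class ValidationReport and its validate method are hypothetical names, and schema and document are passed as strings to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; the names are illustrative only.
class ValidationReport {

    // Returns the problems found; an empty list means the document is valid.
    static List<String> validate(String xsd, String xml) {
        List<String> problems = new ArrayList<>();
        try {
            SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sFactory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    // Record the violation and let validation continue.
                    problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    // Malformed XML cannot be recovered from; abort.
                    throw e;
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Invalid schema or a fatal parse error in the document.
            problems.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return problems;
    }
}
```

One violation may produce more than one entry, since the underlying parser can report several related messages for the same element.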

Quite a few XML files use multiple XML schemas, and for those the code above fails because it loads only one of them. We therefore need to create a Schema that consists of multiple XSD files.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new Source[]{
    new StreamSource(new File("/path/to/schema1.xsd")),
    new StreamSource(new File("/path/to/schema2.xsd"))
});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
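
The difference between one and several schemas can be reproduced in a small, self-contained sketch. Everything in it is invented for illustration: the class MultiSchemaCheck, the two tiny schemas with the namespaces urn:example:a and urn:example:b, and the sample document. The first schema only allows a child element from the second namespace, so the element's declaration has to come from the second schema.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class; names, schemas and document are illustrative only.
class MultiSchemaCheck {

    // Declares a:order, whose single child must be declared in namespace b.
    static final String SCHEMA_A =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:a' elementFormDefault='qualified'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:any namespace='urn:example:b' processContents='strict'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Declares b:item.
    static final String SCHEMA_B =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:b' elementFormDefault='qualified'>"
        + "<xs:element name='item' type='xs:string'/></xs:schema>";

    static final String ORDER_XML =
        "<a:order xmlns:a='urn:example:a' xmlns:b='urn:example:b'>"
        + "<b:item>pen</b:item></a:order>";

    // Validates the XML text against a Schema built from all given XSD texts.
    static boolean isValid(String xml, String... xsds) {
        try {
            Source[] sources = new Source[xsds.length];
            for (int i = 0; i < xsds.length; i++) {
                sources[i] = new StreamSource(new StringReader(xsds[i]));
            }
            SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sFactory.newSchema(sources);
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Invalid schema or a document that violates it.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With both schemas, isValid(ORDER_XML, SCHEMA_A, SCHEMA_B) is expected to accept the document. With SCHEMA_A alone there is no declaration for b:item, so validation is expected to fail.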

When the XSD files are shipped inside the jar, they are not available as files on the file system and have to be loaded as classpath resources instead.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new Source[]{
    new StreamSource(getClass().getResourceAsStream("/path/to/schema1.xsd")),
    new StreamSource(getClass().getResourceAsStream("/path/to/schema2.xsd"))
});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
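
Two details are worth keeping in mind with resources: getResourceAsStream returns null when the resource is missing, and a StreamSource built from a bare stream carries no system ID, so relative references inside the schema (xs:include or xs:import with a relative schemaLocation) have nothing to be resolved against. One possible way to handle both, sketched with a hypothetical helper class ResourceSchemas, is to look up the resource URL and pass it to StreamSource as the system ID.

```java
import java.net.URL;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; the names are illustrative only.
class ResourceSchemas {

    // Creates a StreamSource for a classpath resource such as "/path/to/schema1.xsd".
    static StreamSource source(String resourcePath) {
        URL url = ResourceSchemas.class.getResource(resourcePath);
        if (url == null) {
            // Fail early with a clear message instead of passing null on.
            throw new IllegalArgumentException("Schema resource not found on classpath: " + resourcePath);
        }
        // The URL string becomes the system ID; the parser opens it itself.
        return new StreamSource(url.toExternalForm());
    }
}
```

In the listing above, each new StreamSource(getClass().getResourceAsStream(...)) could then be written as ResourceSchemas.source("/path/to/schema1.xsd").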
