XML Validation with the Java API

Both Java and XML are widespread and heavily used. This post sheds some light on the options for validating XML files with the Java API. In all code listings, exception handling and imports are omitted. The classes used come from the javax.xml, javax.xml.parsers and javax.xml.validation packages. Moreover, this post focuses on simple validation code snippets.

First, we can check if the XML file is well-formed. This can be done by parsing the XML file into a DOM document.
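
A minimal sketch of such a well-formedness check, with imports and error handling spelled out so that it compiles on its own. The class name WellFormedCheck and the method name isWellFormed are made up for illustration, and taking an InputStream is just one possible choice (DocumentBuilder.parse also accepts a File).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the Java API.
final class WellFormedCheck {

    /** Returns true if the stream can be parsed into a DOM document. */
    static boolean isWellFormed(InputStream xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // also reject undeclared namespace prefixes
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(xml); // throws SAXException if the input is not well-formed
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting Document is thrown away here because only the yes/no answer matters; no schema is involved at this stage.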

Next, it is possible to validate the XML file against an XML schema. However, in this case we validate the XSD file first to ensure that the schema itself is valid.
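
One way to check the XSD on its own is to let SchemaFactory compile it, since newSchema throws a SAXException when the schema document is broken. The following is only a sketch under that assumption; the class name SchemaCheck and the method name isValidSchema are invented for this example.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the Java API.
final class SchemaCheck {

    /** Returns true if the given XSD source compiles into a Schema. */
    static boolean isValidSchema(Source xsd) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(xsd); // parses and checks the schema document
            return true;
        } catch (SAXException e) {
            return false; // malformed XML or a schema-level error
        }
    }
}
```

For a schema on disk, the argument could be something like new StreamSource(new File("example.xsd")), where the file name is a placeholder.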

With the valid XSD file we can validate the XML file against this schema.
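
The validation step can be sketched as follows, assuming a single XSD and the default error handling of Validator (which throws on the first error). XmlSchemaValidation and isValid are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the Java API.
final class XmlSchemaValidation {

    /**
     * Returns true if the XML source is valid against the XSD source.
     * A broken XSD also yields false in this simplified sketch.
     */
    static boolean isValid(Source xsd, Source xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(xsd);
            Validator validator = schema.newValidator();
            validator.validate(xml); // throws SAXException on the first validation error
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A Schema object is immutable and can be reused, so real code would probably compile it once and create a new Validator per document.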

Quite a few XML files use multiple XML schemas, and validating such a file against a single XSD file will fail. Therefore, we need to create a Schema which consists of multiple XSD files.
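
A sketch of building one Schema from several XSD files, using the SchemaFactory.newSchema(Source[]) overload. It assumes that each XSD declares its own target namespace; MultiSchemaValidation, compile and isValid are names invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the Java API.
final class MultiSchemaValidation {

    /** Compiles several XSD sources into one Schema object. */
    static Schema compile(Source... xsds) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            return factory.newSchema(xsds); // resolves to the Source[] overload
        } catch (SAXException e) {
            throw new IllegalArgumentException("At least one XSD could not be compiled", e);
        }
    }

    /** Returns true if the XML source is valid against the combined schema. */
    static boolean isValid(Schema schema, Source xml) {
        try {
            schema.newValidator().validate(xml);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With files, the call might look like compile(new StreamSource(new File("first.xsd")), new StreamSource(new File("second.xsd"))), where both file names are placeholders.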

When the XSD files are shipped inside the jar, they have to be referenced as classpath resources instead of files.
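
Loading a schema from the classpath could look roughly like this. ClasspathSchemaLoader and loadSchema are hypothetical names, and failing fast with an IllegalArgumentException when the resource is missing is a design choice of this sketch, not something the API prescribes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the Java API.
final class ClasspathSchemaLoader {

    /**
     * Loads an XSD via the class loader of the given class.
     * A leading slash in resourcePath means "from the classpath root".
     */
    static Schema loadSchema(Class<?> anchor, String resourcePath) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try (InputStream in = anchor.getResourceAsStream(resourcePath)) {
            if (in == null) { // getResourceAsStream returns null for a missing resource
                throw new IllegalArgumentException("Schema resource not found: " + resourcePath);
            }
            return factory.newSchema(new StreamSource(in));
        } catch (SAXException e) {
            throw new IllegalArgumentException("Invalid schema: " + resourcePath, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call might look like loadSchema(MyApp.class, "/schemas/example.xsd"), with both names being placeholders. Note that a schema read from a plain stream carries no system ID, so relative xs:include or xs:import locations may not resolve.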

A subtle difference …

I am currently developing with IntelliJ IDEA Community Edition (which is open source and free as in beer). IntelliJ supports Java 7 very well and, in my subjective opinion, is a little richer in features when it comes to refactoring and intelligent code suggestions.

Aim

My aim was to load some XML Schema files for validation purposes. Such files use the extension .xsd, which stands for XML Schema Definition. Now to the fun part:

You can load such files directly using the File class. This is simple, but it no longer works once your application is packaged as a jar and the xsd file lives inside that jar. In that case you have to reference the xsd file as a resource within the jar. This can be done via the class path and its package structure, using the class loader.

If you have an xsd file named test.xsd in a package called logic along with a Java class Solver, you can do the following things:

// inside instance methods of Solver (name is resolved relative to the package of Solver)
InputStream stream = this.getClass().getResourceAsStream("test.xsd");
// inside static methods of Solver
InputStream stream = Solver.class.getResourceAsStream("test.xsd");

// inside instance methods of any class (leading slash: path from the classpath root)
InputStream stream = this.getClass().getResourceAsStream("/logic/test.xsd");
// inside static methods of any class
InputStream stream = AnyClass.class.getResourceAsStream("/logic/test.xsd");

More information on this can be found here.

Problem

And here is the very important information: IntelliJ IDEA does NOT copy your xsd files automatically to your binary folder. Consequently, using getResource[AsStream] will NOT work (it returns null because the resource is not found). In Eclipse, everything worked fine.

Why?

IntelliJ IDEA keeps a semicolon-separated list of wildcard patterns for files to be copied, which is sold as a feature. This approach uses a white list, whereas Eclipse uses a black list.

How to solve?

Open [Settings] -> [Compiler] -> [Resource Patterns] and append ;?*.xsd to the resource pattern input field.