XML Validation with the Java API

Both Java and XML are widespread and heavily used. This post sheds some light on the options for validating XML files with the Java API. All code listings omit exception handling and imports. The classes used come from the javax.xml, javax.xml.parsers and javax.xml.validation packages, plus javax.xml.transform for Source and StreamSource. Moreover, this post focuses on simple validation code snippets.

First, we can check if the XML file is well-formed. This can be done by parsing the XML file into a DOM document. If the file is not well-formed, parse throws a SAXException.

String path = "my/test/path/to/file.xml";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
db.parse(new File(path));
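
Since the listing above leaves out the exception handling, here is a minimal sketch of how the well-formedness check could be wrapped into a yes/no answer. The class name WellFormedCheck and the method isWellFormed are made up for this illustration, and the XML is read from a String instead of a file so the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original listings.
final class WellFormedCheck {

    // Returns true if the given XML text can be parsed into a DOM document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // resolve namespaces while parsing
            DocumentBuilder db = dbf.newDocumentBuilder();
            db.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser signals a document that is not well-formed with a SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Setup or I/O trouble is not a statement about the document itself.
            throw new IllegalStateException("Could not run the XML parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));
        System.out.println(isWellFormed("<a><b></a>"));
    }
}
```

For a file on disk, db.parse(new File(path)) from the listing above can be used in place of the InputSource.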

Next, it is possible to validate the XML file against an XML schema. Before doing so, we validate the XSD file itself to make sure that the schema is valid.

String path = "my/test/path/to/file.xsd";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
sFactory.newSchema(new File(path));
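
As a sketch of what the omitted exception handling might look like here: newSchema throws a SAXException when the XSD cannot be compiled, so catching it gives us the reason. The class SchemaCheck, the method schemaError and the two sample schemas are assumptions made for this example only.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original listings.
final class SchemaCheck {

    // A small schema that should compile.
    static final String GOOD_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note' type='xs:string'/>"
      + "</xs:schema>";

    // Same schema, but referring to a type that does not exist.
    static final String BAD_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note' type='xs:doesNotExist'/>"
      + "</xs:schema>";

    // Returns null if the XSD text compiles, otherwise the parser's message.
    static String schemaError(String xsd) {
        SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            sFactory.newSchema(new StreamSource(new StringReader(xsd)));
            return null;
        } catch (SAXException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println("good schema: " + schemaError(GOOD_XSD));
        System.out.println("bad schema:  " + schemaError(BAD_XSD));
    }
}
```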

With the valid XSD file we can validate the XML file against this schema.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new File("path/to/schema.xsd"));
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
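
Without an ErrorHandler, validate stops at the first problem by throwing a SAXException. If we would rather see all problems of a file at once, one option is to register an ErrorHandler that collects them. The following is only a sketch: the class CollectingValidation, the method validate and the sample schema are invented for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of the original listings.
final class CollectingValidation {

    // Sample schema: an order with an integer quantity and a decimal price.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='quantity' type='xs:int'/>"
      + "<xs:element name='price' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Validates xml against xsd and returns every reported problem.
    static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sFactory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                @Override public void error(SAXParseException e) {
                    // Record the problem and let the validator continue.
                    problems.add("error: " + e.getMessage());
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // the parser cannot go on after a fatal error
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // A broken schema or a document that is not well-formed ends up here.
            problems.add("fatal: " + e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String xml = "<order><quantity>two</quantity><price>cheap</price></order>";
        for (String problem : validate(SCHEMA, xml)) {
            System.out.println(problem);
        }
    }
}
```

An empty list means the file is valid. How many messages a single mistake produces depends on the parser implementation.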

Quite a few XML files use multiple XML schemas, and for those files the code above will fail. Therefore, we need to create a Schema which consists of multiple XSD files.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new Source[]{
    new StreamSource(new File("/path/to/schema1.xsd")),
    new StreamSource(new File("/path/to/schema2.xsd"))
});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
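
To make the multi-schema case a bit more tangible, here is a small self-contained sketch. Everything in it is an assumption for illustration: the class MultiSchemaValidation, the two namespaces urn:example:a and urn:example:b, and the two tiny schemas. Schema A allows one strictly validated element from another namespace, and schema B declares that element, so the document should only validate when both schemas are combined.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original listings.
final class MultiSchemaValidation {

    // Schema A: an order that contains one element from a foreign namespace.
    static final String SCHEMA_A =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:example:a' elementFormDefault='qualified'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:any namespace='##other' processContents='strict'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Schema B: declares the element used inside the order.
    static final String SCHEMA_B =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:example:b' elementFormDefault='qualified'>"
      + "<xs:element name='quantity' type='xs:int'/></xs:schema>";

    static final String DOCUMENT =
        "<a:order xmlns:a='urn:example:a' xmlns:b='urn:example:b'>"
      + "<b:quantity>3</b:quantity></a:order>";

    // Returns true if xml is valid against the combination of all given schemas.
    static boolean isValid(String[] xsds, String xml) {
        try {
            Source[] sources = new Source[xsds.length];
            for (int i = 0; i < xsds.length; i++) {
                sources[i] = new StreamSource(new StringReader(xsds[i]));
            }
            SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sFactory.newSchema(sources);
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException e) {
                return false; // the document violates the combined schema
            }
        } catch (SAXException e) {
            throw new IllegalArgumentException("The schema set does not compile", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(new String[] {SCHEMA_A, SCHEMA_B}, DOCUMENT));
        System.out.println(isValid(new String[] {SCHEMA_A}, DOCUMENT));
    }
}
```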

When the XSD files are shipped inside the jar, they have to be referenced as resources instead.

String path = "my/test/path/to/file.xml";
SchemaFactory sFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sFactory.newSchema(new Source[]{
    new StreamSource(getClass().getResourceAsStream("/path/to/schema1.xsd")),
    new StreamSource(getClass().getResourceAsStream("/path/to/schema2.xsd"))
});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(path)));
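
Two things are worth keeping in mind with resources: getResourceAsStream returns null when the resource is missing, and a StreamSource built from a bare stream carries no system ID, so relative xs:include or xs:import locations inside the XSD have nothing to resolve against. One possible way around both, sketched here with a made-up helper class SchemaResources and method resourceSource, is to look up the resource URL and pass it on as the system ID.

```java
import java.net.URL;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original listings.
final class SchemaResources {

    // Builds a Source for a classpath resource, failing early if it is missing.
    static Source resourceSource(Class<?> anchor, String name) {
        URL url = anchor.getResource(name);
        if (url == null) {
            throw new IllegalArgumentException("Schema resource not found: " + name);
        }
        // The URL doubles as the system ID, so relative schema locations
        // inside the XSD can be resolved against it.
        return new StreamSource(url.toExternalForm());
    }

    public static void main(String[] args) {
        // Any resource that exists on the classpath works for a quick check.
        Source source = resourceSource(Object.class, "/java/lang/Object.class");
        System.out.println(source.getSystemId());
    }
}
```

The returned Source objects can be put into the Source[] array from the listing above in place of the getResourceAsStream variants.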

Netbeans and BPEL *Update*

My current project involves the use of BPEL. I decided to work with Netbeans and Glassfish v2 with openESB, as this combination provides quite good tool integration.

However, I need to create projects for and deploy about 32 BPEL processes, along with quite a few WSDL web service implementations.

What is the problem:
1. BPEL validation cannot be automated
I use an MDA approach to generate my BPEL code with Eclipse EMF and cannot automatically validate the generated files using Netbeans. The BPEL validator cannot be called from outside Netbeans. I searched a lot, but I haven’t found anyone who managed to do this, and many people said it is not possible.

What a pity …

*Update*

It can be done using Ant.

– Make sure Ant is installed (by typing ant on the command line)
– Go into the project directory and execute the following
ant -Desb.netbeans.home=PATH_TO_YOUR_NETBEANS

– If you need more detailed information, use -verbose

By building the default target, the BPEL processes are automatically validated.

2. BPEL validation in Netbeans only runs fast if each BPEL file, together with its WSDLs and XSDs, is in its own project.

Each BPEL file in its own project: validation time: 5 seconds
Every BPEL file in the same project: validation time for each BPEL file: 75 seconds

Times may vary as the size is not always the same, but you get the idea: the difference is roughly a factor of 15.

So, use small projects.

After some thought, I believe the problem arises during the evaluation of the XPath expressions in the copy constructs.

Maybe it is possible to use absolute locations for the BPEL imports to reduce the validation time, but this results in even more problems when copying and deploying on the Glassfish …