Back in February, in a slightly plaintive post, the W3C sysadmins asked that people stop hammering their servers with requests for XHTML DTDs. Everyone said yes, this is a stupid problem that wouldn’t have happened if a) the XML spec were less dumb, or b) XML libraries were less dumb.
After that post, I spent two whole days fighting with XML catalogs — possibly the worst-documented XML spec ever — to make sure my Java code wasn’t downloading a DTD every time it read an XHTML document.
To my annoyance, no one seems to have posted any cut-and-paste solutions to this problem. Setting properties on the SAX parser is no help, and the XML catalogs solution is a pain to set up.
So what if someone wrote a “dummy” XML entity resolver that does nothing? Here’s what I came up with:
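import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DummyEntityResolver implements EntityResolver {
    public InputSource resolveEntity(String publicID, String systemID) throws SAXException {
        // Hand the parser an empty document instead of letting it fetch the DTD.
        return new InputSource(new StringReader(""));
    }
}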
Lo and behold, it works! The key is the return line — if you return null, the SAX parser reverts to its default behavior and downloads the DTD.
Use it like this:
XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setEntityResolver(new DummyEntityResolver());
reader.setContentHandler(new YourContentHandler());
reader.parse(your_xml_source);
The catch is that this will break any externally-defined entities, including standard XHTML entities like &copy;. The built-in XML entities such as &amp;, and numeric character references like &#x43;, will still work.
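To see what "break" means in practice, here is a rough sketch that parses a tiny document through the same empty-DTD trick and records what the content handler receives. The class and method names (EntityFateDemo, Recorder, parseWithEmptyDtd) are made up for illustration. With the parser bundled in the JDK I would expect &copy; to be reported through skippedEntity rather than expanded, but that is parser-dependent, so the sketch simply records whichever outcome occurs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class EntityFateDemo {
    /** Collects character data and the names of any entities the parser skipped. */
    static class Recorder extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        final List<String> skipped = new ArrayList<>();
        String error; // set only if the parser gives up instead of skipping

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void skippedEntity(String name) {
            skipped.add(name);
        }
    }

    static Recorder parseWithEmptyDtd(String xml) throws Exception {
        Recorder recorder = new Recorder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Same idea as DummyEntityResolver: every external entity resolves to nothing.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        reader.setContentHandler(recorder);
        reader.setErrorHandler(recorder);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            recorder.error = e.getMessage();
        }
        return recorder;
    }

    public static void main(String[] args) throws Exception {
        String doc = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html><body><p>&amp;&#x43;&copy;</p></body></html>";
        Recorder r = parseWithEmptyDtd(doc);
        System.out.println("text:    " + r.text);
        System.out.println("skipped: " + r.skipped);
        System.out.println("error:   " + r.error);
    }
}
```

If your documents rely on named XHTML entities, check the skipped list (or the error) before trusting the output.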
You can check that you’re not downloading any DTDs by watching the output of ngrep -q DTD while running your XML parser. If it doesn’t print anything, you’re good.
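If ngrep isn't handy, a cruder in-process check is to have the resolver write down every system ID the parser asks for. This is only a sketch: ResolverAudit and its methods are hypothetical names, and it shows that the parser consulted the resolver and got an empty source back, not that nothing at all touched the network.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ResolverAudit {
    /** Remembers each requested system ID, then answers with an empty DTD. */
    static class RecordingDummyResolver implements EntityResolver {
        final List<String> requested = new ArrayList<>();

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            requested.add(systemId);
            return new InputSource(new StringReader(""));
        }
    }

    /** Parses the document and returns the system IDs the parser tried to load. */
    static List<String> requestedSystemIds(String xml) throws Exception {
        RecordingDummyResolver resolver = new RecordingDummyResolver();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(xml)));
        return resolver.requested;
    }

    public static void main(String[] args) throws Exception {
        String doc = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html><head><title>t</title></head><body><p>hi</p></body></html>";
        System.out.println(requestedSystemIds(doc));
    }
}
```

Anything in the printed list is something the parser would otherwise have tried to fetch by itself.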
Thanks, this was just what I needed. I was a little baffled as to why there wasn’t more information about this on the web, but anyway, your solution worked out really well.
Awesome.
I had the exact same problem. Your solution worked.
I must point out, however, that while your solution works perfectly, the following, which looks like it should also work, does not:
SAXParser saxParser = factory.newSAXParser();
saxParser.getXMLReader().setEntityResolver(new DummyEntityResolver());
saxParser.parse(new InputSource(conn.getInputStream()), new YourHandler());
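A likely explanation, at least for the JDK's own implementation: SAXParser.parse(InputSource, DefaultHandler) installs the handler you pass as the content handler, error handler, DTD handler and entity resolver, so the resolver set a line earlier gets replaced. DefaultHandler's own resolveEntity returns null, which brings back the default download behavior. One way around it, sketched here with hypothetical names (HandlerResolverDemo, QuietHandler), is to override resolveEntity in the handler itself.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerResolverDemo {
    /** Stand-in for YourHandler that also answers entity lookups itself. */
    static class QuietHandler extends DefaultHandler {
        int resolveCalls = 0;

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            resolveCalls++;
            // Same empty-source trick as DummyEntityResolver.
            return new InputSource(new StringReader(""));
        }
    }

    /** Parses via SAXParser.parse and reports how often the handler was asked to resolve. */
    static int countResolveCalls(String xml) throws Exception {
        SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
        QuietHandler handler = new QuietHandler();
        saxParser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.resolveCalls;
    }

    public static void main(String[] args) throws Exception {
        String doc = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html><body><p>hi</p></body></html>";
        System.out.println("resolveEntity calls: " + countResolveCalls(doc));
    }
}
```

The other option is to skip SAXParser.parse entirely and call parse on the XMLReader returned by getXMLReader(), as in the original post.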
Great!
That’s my solution too!
And if you still need the DTDs to be read, you can download them and save them as files, and then you don’t need an internet connection to run your program. See below:
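import java.io.FileInputStream;
import java.io.FileNotFoundException;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DummyEntityResolver implements EntityResolver {
    public InputSource resolveEntity(String publicID, String systemID) throws SAXException {
        try {
            // Serve the locally saved copy of the DTD instead of downloading it.
            return new InputSource(new FileInputStream("temp/PropertyList-1.0.dtd"));
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            // Returning null makes the parser fall back to its default lookup.
            return null;
        }
    }
}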
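The resolver above returns the same file for every entity, which is fine for a single DTD. A slightly more general sketch, assuming you keep a table of system IDs and local copies, might look like this. LocalDtdResolver, its map method and the example.invalid URL are all invented for illustration. Unmapped http/https IDs get an empty source so nothing is fetched, and anything else is left to the parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class LocalDtdResolver implements EntityResolver {
    private final Map<String, Path> localCopies = new HashMap<>();

    /** Registers a local file to serve in place of the given system ID. */
    public LocalDtdResolver map(String systemId, Path localFile) {
        localCopies.put(systemId, localFile);
        return this;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        Path local = localCopies.get(systemId);
        if (local != null) {
            InputSource source = new InputSource(Files.newInputStream(local));
            // Lets relative references inside the DTD resolve next to the local copy.
            source.setSystemId(local.toUri().toString());
            return source;
        }
        boolean remote = systemId != null
                && (systemId.startsWith("http:") || systemId.startsWith("https:"));
        // Unmapped remote IDs become empty; local ones use the parser's default lookup.
        return remote ? new InputSource(new StringReader("")) : null;
    }

    /** Parses the document with the given resolver and returns its character data. */
    static String parseText(String xml, EntityResolver resolver) throws Exception {
        StringBuilder text = new StringBuilder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    /** Writes a one-entity DTD to a temp file and parses a document that uses it. */
    static String demo() throws Exception {
        Path dtd = Files.createTempFile("entities", ".dtd");
        try {
            Files.write(dtd, "<!ENTITY copy \"&#169;\">".getBytes(StandardCharsets.UTF_8));
            LocalDtdResolver resolver =
                    new LocalDtdResolver().map("http://example.invalid/entities.dtd", dtd);
            String xml = "<!DOCTYPE p SYSTEM \"http://example.invalid/entities.dtd\">"
                    + "<p>&copy;</p>";
            return parseText(xml, resolver);
        } finally {
            Files.deleteIfExists(dtd);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Because the DTD is actually read here, entities it declares are expanded again, at the cost of keeping the local copies up to date.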
Thank you very much. Works fine!
Thanks!