I'm processing large (1TB) XML files using the StAX API. Let's assume we have a loop handling some elements:
    XMLInputFactory fac = XMLInputFactory.newInstance();
    XMLStreamReader reader = fac.createXMLStreamReader(new FileReader(inputFile));
    // hasNext()/next() stop cleanly at the end of the document;
    // while (true) with nextTag() would eventually throw instead of terminating
    while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT) {
            // handle contents
        }
    }

How do I keep track of overall progress within the large XML file? Fetching the offset from the reader works fine for smaller files:

    int offset = reader.getLocation().getCharacterOffset();

but since the offset is an int, it will probably only work for files up to about 2 GB (Integer.MAX_VALUE).
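One possible workaround, offered as a rough sketch rather than a definitive answer: wrap the input in a stream that counts the bytes it hands to the parser in a long, and compare that count with the file size. The names CountingInputStream and ProgressSketch are made up for this illustration and are not part of StAX. It also assumes the parser is fed an InputStream instead of a FileReader. Because the parser reads ahead in buffers, the count runs slightly ahead of the actual parse position, which should not matter for a progress display on a 1TB file.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper (not part of StAX): counts consumed bytes in a long,
// so the counter does not overflow at 2 GB the way an int offset would.
class CountingInputStream extends FilterInputStream {
    private long count = 0;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) {
            count += skipped;
        }
        return skipped;
    }

    @Override
    public boolean markSupported() {
        // Disable mark/reset so the count can never move backwards.
        return false;
    }

    long getCount() {
        return count;
    }
}

// Hypothetical driver: parses the stream and reports approximate progress.
class ProgressSketch {
    static long countStartElements(InputStream raw, long totalBytes) throws Exception {
        CountingInputStream counting = new CountingInputStream(raw);
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(counting);
        long elements = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                elements++;
                // Report every million elements; the interval is arbitrary.
                if (elements % 1_000_000 == 0) {
                    double percent = 100.0 * counting.getCount() / totalBytes;
                    System.out.printf("approx. %.1f%% read%n", percent);
                }
            }
        }
        reader.close();
        return elements;
    }
}
```

For a file on disk this could be called as ProgressSketch.countStartElements(new FileInputStream(inputFile), inputFile.length()), assuming inputFile is a java.io.File.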