Parse error with nested tags of the same name when converting XML to JSON in Python



I have a small setback. I have a large XML file in the following format:



<doc id="1">Some text</doc>
<doc id="2">more text</doc>


I'm using the following Python script to convert it into JSON format:



from sys import stdout

import gzip
import json

import xmltodict

count = 0
xmlSrc = 'text.xml.gz'
jsDest = 'js/cd.js'

def parseNode(_, node):
    # Called by xmltodict for each element at item_depth;
    # writes one JSON object per line to the output file.
    global count
    count += 1
    stdout.write("\r%d" % count)

    jsonNode = json.dumps(node)
    f.write(jsonNode + '\n')
    return True  # returning True tells xmltodict to keep parsing

f = open(jsDest, 'w')

xmltodict.parse(gzip.open(xmlSrc), item_depth=2, item_callback=parseNode)

f.close()

stdout.write("\n") # move the cursor to the next line


Is it possible to detect the end, break, and then continue converting?
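To make the question concrete, here is a rough sketch of the kind of thing I have in mind. It rests on an assumption I have not confirmed: that the parse error comes from the file holding many sibling <doc> elements with no single enclosing root, as in the sample above. It uses the standard library's XMLPullParser instead of xmltodict, feeds a synthetic <root> wrapper first, and tracks depth so that a <doc> nested inside another <doc> is not emitted separately. The function name iter_docs, the wrapper name root, and the chunk size are my own placeholders; the "@id" / "#text" keys just mirror the key naming xmltodict uses.

```python
import io
import json
import xml.etree.ElementTree as ET


def iter_docs(stream, chunk_size=65536):
    """Yield one dict per top-level <doc> read from a text-mode stream.

    Assumes the stream holds sibling <doc> elements without a common root.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    parser.feed("<root>")  # synthetic wrapper (hypothetical name)
    depth = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                continue
            depth -= 1
            # depth == 1 means we just closed a direct child of the wrapper
            if depth == 1 and elem.tag == "doc":
                yield {"@id": elem.get("id"), "#text": "".join(elem.itertext())}
                elem.clear()  # drop text and children to limit memory use
    parser.feed("</root>")
    parser.close()  # raises ParseError if the input was truncated


if __name__ == "__main__":
    sample = io.StringIO('<doc id="1">Some text</doc>\n<doc id="2">more text</doc>\n')
    for doc in iter_docs(sample):
        print(json.dumps(doc))
```

For the real file, the stream would presumably be gzip.open(xmlSrc, 'rt', encoding='utf-8') so that the parser receives text rather than bytes, and each yielded dict could be written with json.dumps exactly as in parseNode.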

