Monday, 2 March 2015

Parse TEI XML with Go



I need to parse a TEI XML file with Go. I tried the encoding/xml unmarshaller. Here is the example: http://ift.tt/18gDYAX


Problems:



  1. The file is not valid TEI, but it is well-formed XML. Still, the example prints nothing. If I remove the <TEI> tag on line 23 (so the XML is no longer well-formed), the example prints something.

  2. How can I get, in the Line struct, a string holding the content of the <l> element?

  3. In the Page struct I need the value of the n attribute. How do I get it?
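
For points 2 and 3, here is a minimal sketch of what the struct tags might look like with encoding/xml. It is not the code from the linked example: the sample document, the Doc type, the parse helper, and the use of <pb> for pages are all assumptions made for illustration, and the real file's nesting may differ. The idea is that `,chardata` or `,innerxml` captures the content of an element, `n,attr` captures an attribute, and an `a>b` tag descends through nested elements below the root.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Hypothetical, simplified TEI-like input; the real file's layout may differ.
const sample = `<?xml version="1.0" encoding="UTF-8"?>
<TEI>
  <text>
    <body>
      <pb n="1"/>
      <l>first <hi>line</hi></l>
      <l>second line</l>
    </body>
  </text>
</TEI>`

// Line holds the content of one <l> element.
type Line struct {
	Text string `xml:",chardata"` // character data of <l>
	Raw  string `xml:",innerxml"` // raw inner XML, nested tags included
}

// Page holds the n attribute of a page element (assumed here to be <pb>).
type Page struct {
	N string `xml:"n,attr"`
}

// Doc maps onto the root element. Fields are matched against children of
// the root, so the path tags walk down through <text> and <body>.
type Doc struct {
	XMLName xml.Name `xml:"TEI"`
	Pages   []Page   `xml:"text>body>pb"`
	Lines   []Line   `xml:"text>body>l"`
}

// parse is a small illustrative helper around xml.Unmarshal.
func parse(data []byte) (Doc, error) {
	var d Doc
	err := xml.Unmarshal(data, &d)
	return d, err
}

func main() {
	d, err := parse([]byte(sample))
	if err != nil {
		fmt.Println("unmarshal error:", err)
		return
	}
	for _, p := range d.Pages {
		fmt.Printf("page n=%q\n", p.N)
	}
	for _, l := range d.Lines {
		fmt.Printf("line text=%q raw=%q\n", l.Text, l.Raw)
	}
}
```

Since Unmarshal matches struct fields against the children of the root element, a struct written for an inner element can come back empty once a <TEI> wrapper is present. Whether that explains point 1 depends on the structs in the linked example, so treat it as a guess.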


Is unmarshalling the right way to parse this kind of file, or would Nokogiri be a better solution?


Thanks

