Python (xml.etree) not reading XML text

I've not worked with XML before, and I'm having trouble getting the text out of the following XML:


<w>
  <shortening>n</shortening>
  ūmi 
  <mor type="mor">
    <mw>
      [extra stuff]
    </mw>
    <menx>rest</menx>
    <menx>sleep</menx>
    <gra type="gra" relation="ROOT" head="0" index="1"/>
  </mor>
</w>

xml.etree doesn't recognise the text ūmi inside the <w> element. I think this is because the text is preceded by the <shortening> tag. This shouldn't be a Unicode issue, because plenty of other Unicode characters read just fine (this is transliterated Hebrew).

Is there an easy way to fix this? Is this malformed XML?
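
For what it's worth, the sample looks like well-formed XML with mixed content (text and child elements side by side), so the likely cause is how ElementTree stores such text: an element's `.text` only covers what comes before its first child, and anything after a child's closing tag is kept in that child's `.tail`. Below is a minimal sketch under that assumption. The sample is inlined as a string, and the variable names and the `direct_text` helper are made up for illustration, not part of the library.

```python
import xml.etree.ElementTree as ET

# The sample from the question, inlined as a string for illustration.
xml_source = """<w>
  <shortening>n</shortening>
  ūmi
  <mor type="mor">
    <mw>
      [extra stuff]
    </mw>
    <menx>rest</menx>
    <menx>sleep</menx>
    <gra type="gra" relation="ROOT" head="0" index="1"/>
  </mor>
</w>"""

w = ET.fromstring(xml_source)

# w.text holds only the whitespace before <shortening>, so it looks empty.
leading_text = (w.text or "").strip()

# The text after </shortening> is stored as the tail of that child.
shortening = w.find("shortening")
word_text = (shortening.tail or "").strip()


def direct_text(element):
    """Hypothetical helper: join an element's own text with the tails of
    its direct children, skipping the text inside the children themselves."""
    parts = [element.text or ""]
    parts.extend(child.tail or "" for child in element)
    return "".join(parts).strip()


print(repr(leading_text))
print(word_text)
print(direct_text(w))
```

Reading `shortening.tail` is enough if the word always follows `<shortening>`. The helper is a more general option for cases where the word might come before the first child or between other children.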
