Although I found a lot of answers here, unfortunately none of them work for me.
I have Ubuntu x64, Python 3.4.2.
I am parsing a web page that contains HTML entities such as &nbsp;, among others:
import xml.etree.ElementTree as ET
page = 'some string I get from requests.get'
parser = ET.XMLParser()
parser.parser.UseForeignDTD(True)
tree = ET.fromstring(page, parser=parser)
A lot of answers contain this code in order to prevent errors like "unknown entity &nbsp;". When I run this code, it throws an error:
AttributeError: 'xml.etree.ElementTree.XMLParser' object has no attribute 'parser'
or
AttributeError: 'xml.etree.ElementTree.XMLParser' object has no attribute '_parser'
(depending on which member of the parser object I use in the fourth line of the code above). The strange part is that when I navigate to this code from the PyCharm IDE, it shows that this member exists and is assigned in the class constructor:
# underscored names are provided for compatibility only
self.parser = self._parser = parser
self.target = self._target = target
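A likely explanation, offered as an assumption rather than something confirmed in this post: since Python 3.3, xml.etree.ElementTree swaps in a C-accelerated XMLParser when one is available, and that class may not expose the parser / _parser attributes set by the pure-Python source the IDE opens. A minimal sketch to check which implementation is actually in use (the variable names are only illustrative):

```python
import xml.etree.ElementTree as ET

parser = ET.XMLParser()

# True only if this build exposes the underlying expat parser object.
has_parser_attr = hasattr(parser, "parser")

# Methods written in Python carry a __code__ attribute; methods of a
# C-implemented class do not.
is_pure_python = hasattr(ET.XMLParser.feed, "__code__")

print("exposes .parser:", has_parser_attr)
print("pure-Python XMLParser:", is_pure_python)
```

If both values print as False, the running class is not the one whose constructor is shown above, which would account for the AttributeError.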
My questions are:
- Why doesn't this work?
- Is it possible to avoid adding each entity manually to prevent parse errors?
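One possible way around the second question, as a sketch rather than a confirmed fix: rewrite the named HTML entities as numeric character references before parsing, using the standard html.entities table, so no entity has to be registered by hand. The helper name replace_named_entities and the sample page string are hypothetical, and this assumes the page is otherwise well-formed XML.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities XML already understands; leave these untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def replace_named_entities(text):
    """Turn named HTML entities (e.g. &nbsp;) into numeric ones (&#160;)."""
    def to_numeric(match):
        name = match.group(1)
        # Names missing from the HTML table are left as-is, so they
        # will still raise a parse error later.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#{};".format(name2codepoint[name])

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", to_numeric, text)


# Hypothetical sample standing in for the text returned by requests.get.
page = "<p>a&nbsp;b &amp; c&copy;</p>"
tree = ET.fromstring(replace_named_entities(page))
print(repr(tree.text))
```

This avoids touching the parser internals entirely, at the cost of one extra pass over the page text.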