Saturday, 4 April 2015

Python 3.4.0 -- xpath -- gets me empty list



I'm trying to retrieve data from an online Ukrainian dictionary. Everything works fine with:



import urllib.request
import lxml.html

url = "http://ift.tt/1xMYUM1"
page = urllib.request.urlopen(url)
pageWritten = page.read()
pageReady = pageWritten.decode('utf-8')
xmldata = lxml.html.document_fromstring(pageReady)
# Collect every text node inside <p class="MsoNormal"> elements
text1 = xmldata.xpath('//p[@class="MsoNormal"]//text()')


But the same approach fails with another link:



import urllib.request
from urllib.parse import urlparse, parse_qs, urlencode
import lxml.html

url = 'http://ift.tt/19Yy4pn'
parsed_url = urlparse(url)
parameters = parse_qs(parsed_url.query)
# parse_qs returns lists of values, so doseq=True is needed to re-encode them correctly
url = parsed_url._replace(query=urlencode(parameters, doseq=True)).geturl()
page = urllib.request.urlopen(url)

pageWritten = page.read()
pageReady = pageWritten.decode('utf-8')
xmldata = lxml.html.document_fromstring(pageReady)
text1 = xmldata.xpath('//div[@itemprop="articleBody"]')
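
One detail of the URL rebuild that may matter: parse_qs returns every value as a list, so urlencode treats the values differently depending on whether doseq=True is passed. A minimal sketch with a made-up query string (not the query from my real link) to compare the two calls:

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical query string, used only for illustration
query = "word=%D1%81%D0%BB%D0%BE%D0%B2%D0%BE&page=2"
parameters = parse_qs(query)  # every value comes back as a list

# Without doseq, each list is converted with str(), brackets and quotes included
plain = urlencode(parameters)
# With doseq=True, each list element is encoded as its own value
per_value = urlencode(parameters, doseq=True)

print(plain)
print(per_value)
```

Only the doseq=True result reproduces the original query string, so the server may be receiving a different request than the one the browser sends.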


The XPath call returns an empty list. The XPath expression itself seems fine, as I double-checked it with XPath Helper in Chrome.
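
A possible way to narrow this down, assuming the difference lies between the HTML that urlopen downloads and the DOM that Chrome shows after scripts have run (I have not confirmed that this is the cause): count the matching elements in the raw markup with the standard library's html.parser instead of lxml. The class AttrFinder, the function count_in_raw_html, and both sample strings are made up for illustration.

```python
from html.parser import HTMLParser


class AttrFinder(HTMLParser):
    """Counts start tags of a given name that carry a given attribute value."""

    def __init__(self, tag, attr, value):
        super().__init__()
        self.tag, self.attr, self.value = tag, attr, value
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names
        if tag == self.tag and (self.attr, self.value) in attrs:
            self.count += 1


def count_in_raw_html(html_text, tag, attr, value):
    finder = AttrFinder(tag, attr, value)
    finder.feed(html_text)
    finder.close()
    return finder.count


# Hypothetical stand-ins: markup as served vs. markup after scripts have filled it in
served = '<html><body><div id="content"></div></body></html>'
rendered = ('<html><body><div id="content">'
            '<div itemprop="articleBody">text</div></div></body></html>')

print(count_in_raw_html(served, "div", "itemprop", "articleBody"))
print(count_in_raw_html(rendered, "div", "itemprop", "articleBody"))
```

Running count_in_raw_html(pageReady, "div", "itemprop", "articleBody") on the downloaded page would show whether the element is in the raw HTML at all. If the count is 0, the XPath is probably not the problem.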


Any ideas?

