Tuesday, 11 October 2016

XML : Python scraping via lxml prints empty brackets

I am trying to extract just a few characters of text from a website using lxml: I fetch the page, parse it into a tree, and query the tree with XPath. I copied the XPath from Google Chrome's developer tools, yet the query prints an empty list.

    # imports
    from lxml import html
    import requests

    # get magicseaweed Scripps report
    msScrippsPage = requests.get("http://magicseaweed.com/Scripps-Pier-La-Jolla-Surf-Report/296/.html")

    # make tree from site
    msScrippsTree = html.fromstring(msScrippsPage.content)

    # get wave size
    msScrippsWave = msScrippsTree.xpath("/html/body/div[2]/div[5]/div/div[1]/div[2]/div[2]/div/div[2]/div[1]/div/div[1]/div/div/div/div/div[1]/div/div[2]/ul[1]/li[1]/text()")

    print 'ms SCripps: ', msScrippsWave

The output in the terminal is 'ms SCripps:  []', i.e. the XPath query matched no nodes.
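One possible explanation, offered as a guess rather than a diagnosis of this particular site: an absolute XPath copied from the browser describes the DOM after the browser has repaired the markup and run any scripts, so it may not match the raw HTML that requests downloads. The sketch below illustrates the difference on a small, made-up HTML snippet. The markup, the "rating" class name, and the "2-3" value are all hypothetical stand-ins, not taken from the real page.

```python
from lxml import html

# Hypothetical, simplified markup standing in for the downloaded page.
SAMPLE = """
<html><body>
  <div class="wrapper">
    <ul class="rating">
      <li class="rating-text">2-3<small>ft</small></li>
    </ul>
  </div>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# Absolute, index-based path: it fails as soon as one step is off.
# Here there is no second <div> under <body>, so nothing matches.
brittle = tree.xpath("/html/body/div[2]/ul[1]/li[1]/text()")

# Relative path anchored on an attribute: independent of nesting depth.
robust = tree.xpath("//ul[contains(@class, 'rating')]/li[1]/text()")

print(brittle)  # []
print(robust)   # ['2-3']
```

If a relative query like this also returns an empty list on the real page, it may be worth checking whether the wanted text appears in msScrippsPage.content at all; if it does not, the value is probably filled in by JavaScript and cannot be reached with lxml alone.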
