Unable to extract specific text with an XPath 1.0 query using R's XML package



I would greatly appreciate guidance on how to extract the names of the four cities where this firm has offices. In Firebug, each city name appears greyed out under a cufontext element, for example MEMPHIS. I don't mind getting some extraneous text back, such as the state or address. Three of my failed attempts are shown below.



library(XML)

doc <- htmlTreeParse('http://ift.tt/1srNmbW', useInternalNodes = TRUE, asText = TRUE)
xpathSApply(doc, "//div[@id = 'the_content']", xmlValue, trim = TRUE) # returns list()
xpathSApply(doc, "//div[@id = 'the_content']/div/h3//cufon", xmlValue, trim = TRUE) # returns NULL
xpathSApply(doc, "//div[@id = 'the_content']//cufon[@class = 'cufon cufon-canvas']", xmlValue, trim = TRUE) # returns NULL


Thank you very much.
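
Two things may explain the empty results, though neither has been checked against the real page. First, in the XML package `asText = TRUE` tells the parser that its first argument is the markup itself, so a URL passed that way is parsed as a one-line document rather than fetched. Second, Cufon builds its `cufon` and `cufontext` nodes in the browser with JavaScript, so they are probably absent from the HTML that R downloads, and the plain text would sit directly in the `h3` elements. The sketch below illustrates both points on an inline snippet. The snippet, the placeholder name `SECOND CITY`, and the `example.com` URL are hypothetical stand-ins, not the site's actual markup.

```r
library(XML)

# Hypothetical stand-in for the HTML the server returns (NOT the real page).
# Assumption: the served markup has plain <h3> text and no <cufon> nodes,
# because Cufon adds those in the browser via JavaScript.
html <- "<html><body><div id='the_content'>
  <div><h3>MEMPHIS</h3><p>address line</p></div>
  <div><h3>SECOND CITY</h3><p>address line</p></div>
</div></body></html>"

# asText = TRUE is appropriate here because `html` holds markup, not a URL.
doc <- htmlParse(html, asText = TRUE)

# Query the headings themselves rather than script-generated cufon nodes.
cities <- xpathSApply(doc, "//div[@id = 'the_content']//h3", xmlValue, trim = TRUE)
print(cities)

# The pitfall: with asText = TRUE a URL string is parsed as if it were the
# document, so nothing is downloaded and the div is never found.
url_as_text <- htmlParse("http://example.com/offices", asText = TRUE)
hits <- xpathSApply(url_as_text, "//div[@id = 'the_content']", xmlValue)
print(length(hits))
```

If this guess is right, dropping `asText = TRUE` from the original call (so the URL is actually fetched) and targeting `//div[@id = 'the_content']//h3` would be the first things to try against the live page.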

