I need to obtain some data from a web page, and I'm trying to extract it using R.
Because the information is spread over several pages, I first wrote this code:
    require(XML)
    contador<-c(1:200)
    for(i in contador){
      myURL<-paste("http://www.europa-mop.com/excavadoras-usadas/2-1/anuncios-excavadoras.html?p=",i,sep="")
    }

Secondly, I read the page into web_url with the following code:
    web_url<-getURL(myURL)
    web_url<-readLines(tc<-textConnection(web_url));close(tc)
    webtree<-htmlTreeParse(web_url,error=function(...){})
    body<-webtree$children$html$children$body
    body

Nevertheless, when I execute the following command, I get an error:
    precio<-xpathSApply(body,"//li[@class='label label-secondary text-bold']",xmlValue)

    Input is not proper UTF-8, indicate encoding !
    Bytes: 0xC2 0x3C 0x2F 0x64
    Sequence ']]>' not allowed in content
    Sequence ']]>' not allowed in content
    internal error: detected an error in element content

I've tried several alternatives, but I still can't scrape the information.
Thanks for your comments!
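
Two things in the snippets above may be worth checking; the following is only a minimal sketch in base R, not a confirmed fix. First, a loop that assigns to myURL on every pass leaves only the URL for page 200 in that variable. Second, the bytes listed in the error message are not valid UTF-8, which would be consistent with a page served in a single-byte encoding. The names `base_url`, `last_url`, `all_urls`, `bad` and `fixed` are hypothetical, and Latin-1 is an assumption; the real encoding of the page is not known from the post.

```r
# Base R only; no network access is needed for this sketch.
base_url <- "http://www.europa-mop.com/excavadoras-usadas/2-1/anuncios-excavadoras.html?p="

# A loop that assigns to the same variable keeps only the last value.
for (i in 1:200) {
  last_url <- paste(base_url, i, sep = "")
}

# paste0() is vectorised, so this builds all 200 URLs in one call.
all_urls <- paste0(base_url, 1:200)

# The four bytes reported in the error message, as a raw string.
bad <- rawToChar(as.raw(c(0xC2, 0x3C, 0x2F, 0x64)))
is_valid_before <- validUTF8(bad)

# ASSUMPTION: the source text is Latin-1 (ISO-8859-1). This is a guess
# based on the error bytes, not something the post confirms.
fixed <- iconv(bad, from = "latin1", to = "UTF-8")
is_valid_after <- validUTF8(fixed)
```

If the Latin-1 guess holds, the same conversion could be applied to the downloaded text before parsing, or the encoding could be passed to the parser through the `encoding` argument of `htmlTreeParse`. Either way, each element of `all_urls` would have to be fetched and parsed in turn rather than a single `myURL`.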