I am trying to import data from a Google Docs spreadsheet, which is updated at irregular intervals, using the R package XML. For some reason it sometimes reads only the first sheet and sometimes all of the sheets (there are 25 in this document). An extra caveat is that the spreadsheets contain Hebrew, and most of the time it is not encoded correctly.
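url <- "http://ift.tt/1zbUs4o"
Sys.setlocale("LC_ALL", "Hebrew")

readGoogleSheet <- function(url, na.string = "", header = TRUE) {
  require(XML)
  # Download the published page and collapse it into a single string
  doc <- paste(readLines(url), collapse = " ")
  # Cut the page down to its <table ... </table> markup
  htmlTable <- gsub("^.*?(<table.*</table).*$", "\\1>", doc)
  # Parse the tables in that fragment into a list of data frames
  ret <- readHTMLTable(htmlTable,
                       header = header,
                       stringsAsFactors = FALSE,
                       as.data.frame = TRUE,
                       .Encoding = "UTF-8")
  ret
}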
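One way to narrow the problem down is sketched below, under stated assumptions: parse the page with `htmlParse()` and an explicit `encoding` rather than a regex-cut fragment, then count how many tables come back. The inline `html` string is a hypothetical stand-in for the downloaded page (two tables, one Hebrew cell written with `\u` escapes), not the real spreadsheet, and this is not a confirmed fix for the behaviour described above.

```r
library(XML)

# Hypothetical stand-in for the downloaded page: two tables, one Hebrew cell.
html <- paste0(
  "<html><body>",
  "<table><tr><th>name</th></tr>",
  "<tr><td>\u05e9\u05dc\u05d5\u05dd</td></tr></table>",
  "<table><tr><th>n</th></tr><tr><td>1</td></tr></table>",
  "</body></html>"
)

# Parse the whole document and state the encoding explicitly,
# instead of relying on the session locale.
doc <- htmlParse(html, asText = TRUE, encoding = "UTF-8")

# readHTMLTable() on a parsed document returns a list with
# one element per <table> node it finds.
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Counting the tables on every run shows whether the page itself
# sometimes arrives with fewer sheets.
n_tables <- length(tables)
print(n_tables)
```

With the real page, the text could be fetched with something like `readLines(url, encoding = "UTF-8", warn = FALSE)` and pasted together before being handed to `htmlParse()`. If `n_tables` varies between runs, the difference is probably in what the server returns rather than in the parsing step.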