XML: Use readHTMLTable over a List of Dates and Create a New Date Column with Data

I am trying to write a loop that calls readHTMLTable() for each day in a sequence of dates that I generate with seq(). I have successfully imported all the data between those dates. However, the imported tables have no date column. Since the loop already knows which date it is fetching, I would like it to call readHTMLTable() and then add a new column holding the date used for that iteration.

Here is what I have so far:

    library(XML)
    library(RCurl)
    library(plyr)

    # create the days
    x <- seq(as.Date("2015-04-10"), as.Date("2015-04-15"), by = "day")

    # create a url template for sprintf()
    utmp <- "http://www.basketball-reference.com/friv/dailyleaders.cgi?month=%d&day=%d&year=%d"

    # convert to numeric matrix after splitting for year, month, day
    m <- do.call(rbind, lapply(strsplit(as.character(x), "-"), type.convert))

    # create the list to hold the results (one slot per day, i.e. per row of m)
    tables <- vector("list", nrow(m))

    # get the tables
    for(i in seq_len(nrow(m))) {
      # create the url for the day and if it exists, read it - if not, NULL
      tables[[i]] <- if(url.exists(u <- sprintf(utmp, m[i, 2], m[i, 3], m[i, 1])))
        readHTMLTable(u, stringsAsFactors = FALSE)
      else NULL
    }

    data <- ldply(tables, data.frame)

So basically, I would like my final data frame to include the date each row was read for (the dates held in x and split out in m) as a new column called something like data$Date.

Thanks for any and all help and let me know if you need any clarification!
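
One possible way to do this is to attach the date inside the loop, before the per-day tables are combined. The following is only a minimal sketch that runs offline: fetch_day() is a hypothetical stand-in for the url.exists()/readHTMLTable() step, and the Player and PTS columns are invented placeholders, not the real columns of the site.

```r
# Dates to iterate over, as in the question
x <- seq(as.Date("2015-04-10"), as.Date("2015-04-15"), by = "day")

# HYPOTHETICAL stand-in for the url.exists() + readHTMLTable() step, so the
# sketch runs without network access. It returns a small made-up data frame,
# or NULL for one day to mimic a page that does not exist.
fetch_day <- function(d) {
  if (format(d, "%d") == "12") return(NULL)
  data.frame(Player = c("A", "B"), PTS = c(10L, 20L),
             stringsAsFactors = FALSE)
}

# One slot per day
tables <- vector("list", length(x))

for (i in seq_along(x)) {
  tab <- fetch_day(x[i])
  if (!is.null(tab)) {
    tab$Date <- x[i]      # tag every row with the date of this iteration
    tables[[i]] <- tab    # slots for missing days simply stay NULL
  }
}

# Drop the empty slots and stack the remaining data frames
data <- do.call(rbind, Filter(Negate(is.null), tables))
```

In the real loop, keep in mind that readHTMLTable(u) returns a list of tables unless its which argument is given, so the Date column would have to be added to the data frame inside that list (or to the single table selected with which) rather than to the list itself. Indexing with x[i] instead of rebuilding the date from m keeps the column as a proper Date.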
