I am trying to write a loop that runs readHTMLTable() over a sequence of dates that I generate with seq(). I have successfully imported all the data between those dates. However, the data does not include a date column, so I would like each iteration to read the table and then add a new column holding the date used for that iteration.
Here is what I have so far:
library(XML)
library(RCurl)
library(plyr)

# create the days
x <- seq(as.Date("2015-04-10"), as.Date("2015-04-15"), by = "day")

# create a url template for sprintf()
utmp <- "http://www.basketball-reference.com/friv/dailyleaders.cgi?month=%d&day=%d&year=%d"

# convert to numeric matrix after splitting for year, month, day
m <- do.call(rbind, lapply(strsplit(as.character(x), "-"), type.convert, as.is = TRUE))

# create the list to hold the results, one slot per day
# (nrow(m) rather than length(m), which counts every cell of the matrix)
tables <- vector("list", nrow(m))

# get the tables
for(i in seq_len(nrow(m))) {
  # create the url for the day and if it exists, read it - if not, NULL
  tables[[i]] <- if(url.exists(u <- sprintf(utmp, m[i, 2], m[i, 3], m[i, 1])))
    readHTMLTable(u, stringsAsFactors = FALSE) else NULL
}

data <- ldply(tables, data.frame)

So basically, I would like my final data frame to include the date used for each iteration (taken from x, or rebuilt from m) as a new column called something like data$Date.
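To make the goal concrete, here is a minimal sketch of the result I am after. It assumes small made-up data frames in place of the scraped tables: the names fake_tables, Player, and PTS are hypothetical, and a NULL stands in for a day whose page does not exist. Each table is tagged with its date before everything is combined:

```r
# a shorter version of the sequence of days used above
x <- seq(as.Date("2015-04-10"), as.Date("2015-04-12"), by = "day")

# hypothetical stand-ins for the scraped tables; NULL mimics a missing page
fake_tables <- list(
  data.frame(Player = c("A", "B"), PTS = c(30, 25), stringsAsFactors = FALSE),
  NULL,
  data.frame(Player = "C", PTS = 41, stringsAsFactors = FALSE)
)

# add the date used for each iteration, leaving missing days as NULL
tagged <- lapply(seq_along(x), function(i) {
  tbl <- fake_tables[[i]]
  if (is.null(tbl)) return(NULL)
  tbl$Date <- x[i]  # a single date is recycled down every row of the table
  tbl
})

# drop the NULL entries, then stack the remaining tables
data <- do.call(rbind, Filter(Negate(is.null), tagged))
```

In the real loop, the same tagging step would presumably go right after readHTMLTable(). That function returns a list of tables per page, though, so the date would need to be added to each table in that list.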
Thanks for any and all help and let me know if you need any clarification!