R Web Scraping 'Failed to load HTTP Resource' with XML



I have created a function to scrape statistics from individual soccer match reports. I am usually able to use this to scrape stats from hundreds of matches at a time without a problem. However, for the past week I have been away visiting family for the holidays and cannot run my code at all without receiving the message:



Error: failed to load HTTP resource


I thought this was just my parents' poor internet connection, but I am now having the same problem wherever I go around the country!
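
To narrow things down, I assume I could test a single page fetch outside of readHTMLTable using the RCurl package (RCurl is separate from whatever readHTMLTable uses internally, so this is only a sketch of an independent check, built from the first of my match codes):

library(RCurl)

# Build one test URL the same way as in my function below,
# using the first of my match codes
testurl <- paste("http://ift.tt/1wZnQxh", 369835, "/statistics.html", sep = "")

# Quick check: does the resource respond at all?
url.exists(testurl)

# Fetch the raw HTML, following any redirects the server issues
page <- getURL(testurl, followlocation = TRUE)
nchar(page)  # a tiny value would suggest an error page rather than the statistics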


My code starts with creating a vector of match codes for each game:



matchid <- c(369835, 369842, 369839, 369836, 369834, 369841)


My function is currently as below:



library(XML)  # readHTMLTable() comes from the XML package

GetStats <- function(matchid){
  Stats <- NULL
  for(i in matchid){
    # Build the statistics page URL for this match code
    urli <- paste("http://ift.tt/1wZnQxh", i, "/statistics.html", sep = "")
    # Read every HTML table on the page; the first two are the team tables
    datai <- readHTMLTable(urli)
    hometeami <- datai[[1]]
    awayteami <- datai[[2]]
    # Keep only the first eleven rows of each table (the starting players)
    hometeami <- hometeami[1:11, ]
    awayteami <- awayteami[1:11, ]
    # Stack both teams, tag the rows with the match code, and put it first
    matchdatai <- rbind(hometeami, awayteami)
    matchdatai$match_id <- i
    matchdatai <- matchdatai[c(14, 1:13)]
    Stats <- rbind(Stats, matchdatai)
  }
  return(Stats)
}
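
Reading around, I gather that the XML package's built-in downloader (which readHTMLTable uses when given a URL) is quite limited; for instance, I understand it does not follow redirects or fetch HTTPS pages. If that is the cause, would downloading each page with RCurl first and parsing the text work? A sketch of what I mean (GetStats2 is just a hypothetical variant of my function):

library(XML)
library(RCurl)

GetStats2 <- function(matchid){
  Stats <- NULL
  for(i in matchid){
    urli <- paste("http://ift.tt/1wZnQxh", i, "/statistics.html", sep = "")
    # Download the page with RCurl rather than readHTMLTable's own
    # downloader; followlocation = TRUE also follows server redirects
    pagei <- getURL(urli, followlocation = TRUE)
    datai <- readHTMLTable(htmlParse(pagei, asText = TRUE))
    hometeami <- datai[[1]][1:11, ]
    awayteami <- datai[[2]][1:11, ]
    matchdatai <- rbind(hometeami, awayteami)
    matchdatai$match_id <- i
    matchdatai <- matchdatai[c(14, 1:13)]
    Stats <- rbind(Stats, matchdatai)
  }
  return(Stats)
}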


I then run my function on my individual match codes:



MatchStats <- GetStats(matchid)
View(MatchStats)


But it is at this point that I now receive the error message.


Could someone please help me understand what might be causing this? As far as I can see, nothing has changed on the websites I am trying to scrape. Is it something caused by the internet connections I am now using? Is there a simple fix I can add to my code to overcome this?
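
Even if the underlying cause is the connection, I wonder whether I could at least stop one failing match from aborting the whole run. A sketch of the kind of guard I have in mind, using tryCatch (safe_read is a hypothetical helper, and skipping the failed match is my assumed behaviour):

library(XML)

# Hypothetical wrapper: return NULL instead of stopping when a page
# fails to load, so the calling loop can skip that match
safe_read <- function(url){
  tryCatch(readHTMLTable(url),
           error = function(e){
             message("Skipping ", url, ": ", conditionMessage(e))
             NULL
           })
}

Inside my loop I would then use datai <- safe_read(urli) followed by if (is.null(datai)) next, so the remaining matches still get scraped.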


As mentioned, I am able to run this on hundreds of matches at once on my own connection at my flat, so I am very confused!


Thank you in advance

