I'm trying to read in an XML file:
<?xml version="1.0" encoding="utf-8"?>
<INQUIRY version="4.0">
  <AUTHENTICATION>
    <LICENSEKEY>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</LICENSEKEY>
    <PASSWORD>YYYYYYYYYYY</PASSWORD>
  </AUTHENTICATION>
  <QUERY>
    <TRACKID>1-1-1</TRACKID>
    <TYPE>VALID</TYPE>
    <CHANNEL>INTERNET</CHANNEL>
    <INQUIRYTYPE>O</INQUIRYTYPE>
    <DATA>
      <NAME>BARNES & NOBLE</NAME>
    </DATA>
  </QUERY>
</INQUIRY>
With the code:
install.packages("XML")  # only needed once
library(XML)

location <- "C:/Users/Desktop/temp"
filenames <- dir(location)

DF <- NULL  # start with an empty accumulator so the first rbind() works
for (i in seq_along(filenames)) {
  print(i)
  data <- xmlParse(paste0(location, "/", filenames[i]))
  TMP <- xmlToDataFrame(nodes = getNodeSet(data, "//DATA"))
  DF <- rbind(TMP, DF)
}
This works for most files, but the ampersand seems to throw a wrench into this particular file, and I get the error:
xmlParseEntityRef: no name
Error: 1: xmlParseEntityRef: no name
I've read in several places that the ampersand is the most likely culprit, but if I try to read the file in with readLines() and replace it on the fly, I get an annoying warning:
Warning message:
In readLines(paste0(location, "/", filenames[i])) :
  incomplete final line found on 'C:/Users/Desktop/tmp/1-1-1_req.XML'
What is another workaround for replacing the ampersand, and/or does anyone have an idea why it reads the final line as incomplete?
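For reference, here is a minimal sketch of the replace-on-the-fly approach described above. It is not a definitive fix. The helper names `escape_bare_ampersands` and `read_repaired_xml` are hypothetical, and the regular expression assumes that any `&` not already starting an entity reference (such as `&amp;` or `&#38;`) should be escaped. That assumption would be wrong for files that use CDATA sections. As far as I know, the "incomplete final line" warning only means the file does not end with a newline, so `warn = FALSE` should be safe here.

```r
# Escape only "bare" ampersands; existing references such as &amp;, &#38;
# or &#x26; are left untouched. (Hypothetical helper name.)
escape_bare_ampersands <- function(lines) {
  pattern <- "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
  gsub(pattern, "&amp;", lines, perl = TRUE)
}

# Read one file, repair it in memory, and return the XML as a single string.
# (Hypothetical helper name.)
read_repaired_xml <- function(path) {
  # warn = FALSE silences "incomplete final line found"; the content of the
  # last line is still read even when the file lacks a trailing newline.
  lines <- readLines(path, warn = FALSE)
  paste(escape_bare_ampersands(lines), collapse = "\n")
}

# Demo on a temporary file that, like the problem file, has a bare "&"
# and no trailing newline.
tmp <- tempfile(fileext = ".XML")
cat("<DATA> <NAME>BARNES & NOBLE</NAME> </DATA>", file = tmp)
repaired <- read_repaired_xml(tmp)

# Parse from the repaired string rather than from the file on disk.
# Guarded so the sketch still runs where the XML package is not installed.
if (requireNamespace("XML", quietly = TRUE)) {
  doc <- XML::xmlParse(repaired, asText = TRUE)
  print(XML::xmlToDataFrame(nodes = XML::getNodeSet(doc, "//DATA")))
}
```

Inside the original loop, this would mean replacing the `xmlParse(paste0(...))` call with `xmlParse(read_repaired_xml(paste0(location, "/", filenames[i])), asText = TRUE)`.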