XML: Parsing XML with an ampersand within the file (R)

I am trying to read in an XML file:

    <?xml version="1.0" encoding="utf-8"?>
    <INQUIRY version="4.0">
        <AUTHENTICATION>
            <LICENSEKEY>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</LICENSEKEY>
            <PASSWORD>YYYYYYYYYYY</PASSWORD>
        </AUTHENTICATION>
        <QUERY>
            <TRACKID>1-1-1</TRACKID>
            <TYPE>VALID</TYPE>
            <CHANNEL>INTERNET</CHANNEL>
            <INQUIRYTYPE>O</INQUIRYTYPE>
            <DATA>
                <NAME>BARNES & NOBLE</NAME>
            </DATA>
        </QUERY>
    </INQUIRY>

With the code:

    install.packages("XML")
    library(XML)

    location <- "C:/Users/Desktop/temp"
    filenames <- dir(location)

    DF <- NULL  # accumulator; rbind() ignores NULL on the first pass
    for (i in seq_along(filenames)) {
      print(i)
      data <- xmlParse(paste0(location, "/", filenames[i]))
      TMP <- xmlToDataFrame(nodes = getNodeSet(data, "//DATA"))
      DF <- rbind(TMP, DF)
    }

This works for most files, but the ampersand in this particular file seems to break the parser, and I get the error:

    xmlParseEntityRef: no name
    Error: 1: xmlParseEntityRef: no name

I've read in several places that the ampersand is the most likely culprit, but when I read the file in with readLines() to replace it on the fly, I get this warning:

    Warning message:
    In readLines(paste0(location, "/", filenames[i])) :
      incomplete final line found on 'C:/Users/Desktop/tmp/1-1-1_req.XML'

Is there another workaround for replacing the ampersand, and why does readLines() report the final line as incomplete?
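For reference, here is a minimal sketch of the read-and-replace approach described above. It is one possible way to do it, not a definitive fix. The helper names `escape_bare_ampersands` and `parse_with_bare_ampersands` are hypothetical (they are not part of the XML package), and the demo file is a made-up temporary file rather than the real one. The sketch assumes the only defect in the file is an unescaped `&`, and that the "incomplete final line" warning simply means the file does not end with a newline, which is harmless here.

```r
library(XML)

# Hypothetical helper: escape every "&" that does not already start an
# entity reference such as &amp;, &#38; or &#x26;.
escape_bare_ampersands <- function(lines) {
  gsub("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)",
       "&amp;", lines, perl = TRUE)
}

# Hypothetical helper: read one file, repair it in memory, then parse it.
parse_with_bare_ampersands <- function(path) {
  # warn = FALSE suppresses "incomplete final line found", which only reports
  # that the last line has no terminating newline; the line is still read.
  lines <- readLines(path, warn = FALSE)
  fixed <- paste(escape_bare_ampersands(lines), collapse = "\n")
  # asText = TRUE tells xmlParse() the argument is XML content, not a file name.
  xmlParse(fixed, asText = TRUE)
}

# Demo on a temporary file containing a bare "&" and no trailing newline.
tmp <- tempfile(fileext = ".xml")
cat("<QUERY><DATA><NAME>BARNES & NOBLE</NAME></DATA></QUERY>", file = tmp)

doc <- parse_with_bare_ampersands(tmp)
result <- xmlToDataFrame(nodes = getNodeSet(doc, "//DATA"),
                         stringsAsFactors = FALSE)
print(result$NAME)
```

Inside the loop, `parse_with_bare_ampersands(paste0(location, "/", filenames[i]))` would take the place of the direct `xmlParse()` call. One caveat: the regular expression also rewrites a bare `&` inside a CDATA section, where it is legal, so this sketch is only appropriate for files that do not use CDATA.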
