Sunday, 13 July 2014

Same data causes R to crash when saved and re-loaded - but only if XML package installed



I've come across something I assume is a bug, but I'm having trouble even figuring out how to track down what's happening.


I am using an automated query to generate some XML-formatted information. If I run the query and immediately parse the result with the XML package, it works fine. But if I save the data, reload it, and so much as look at it, R crashes.


With the XML package loaded, I create a variable named r with XML data. If I enter r[[1]] at this point I get back



<pointer: (nil)>
attr(,"class")
[1] "XMLInternalElementNode" "XMLInternalNode" "XMLAbstractNode"


However, if I do



save(r, file = "datafilename.RData")
load("datafilename.RData")


then it loads fine until I enter r[[1]] again, at which point what happens depends on whether the XML package is still loaded. If it is not, I get the same result as above; if it is, I get a pop-up stating that R has encountered a fatal error and needs to close.


I have regenerated the same dataset, as well as a new and smaller test dataset from the same source, a dozen times, and the crash is completely reproducible: it happens every time. I don't understand the options on "save" very well, but I've tried saving as binary and as ascii, and with compress left at its default or set to FALSE, and none of these make any difference. At this point, I'm having trouble figuring out how to track down the problem any further.


Any ideas, for example, what changes when something is saved and reloaded? Are there further ways I could be controlling this process to avoid this type of error?
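
One possible workaround, sketched here under assumptions rather than confirmed as the cause: as far as I understand, the XML package's internal nodes wrap external pointers to C-level memory, and save() does not store that memory, so a reloaded pointer may no longer refer to anything valid. If that is what is happening, saving the XML as plain text and re-parsing it after loading might avoid the crash. The XML string, the variable names and the temporary file below are made up for illustration; they are not my real data.

```r
library(XML)  # assumes the XML package is installed

# Hypothetical stand-in for the XML returned by the query.
xml_text <- "<results><item id='1'>alpha</item><item id='2'>beta</item></results>"
doc <- xmlParse(xml_text, asText = TRUE)

# Convert the parsed document to an ordinary character string,
# which save() can store without relying on any C-level pointer.
doc_string <- saveXML(doc)

path <- tempfile(fileext = ".RData")
save(doc_string, file = path)   # note the named `file` argument
rm(doc, doc_string)

# Later (or in a new session): load the string and parse it again.
load(path)                      # restores `doc_string`
doc <- xmlParse(doc_string, asText = TRUE)
root <- xmlRoot(doc)
first_item <- root[[1]]         # first child node of the root element
xmlValue(first_item)
```

The idea is that only character data ever goes through save() and load(), and the pointer-backed objects are rebuilt fresh in the session that uses them.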


Any help is appreciated - I have been stuck here for quite some time!

