In R: how to extract the first xmlValue from each node of an XMLNodeSet



I have the following document, which is an XMLNodeSet. I would like to extract the first XML value from each node; if a node has no child values, it should give NA. The result I want is: "11557040" "23667301" "NA"


But sapply(doc, xmlValue) pastes all the values in each node together and gives the following result:



[1] "115570409626101208908" "2366730130010285360545" "\n\t"


Any help is appreciated.



> doc
[[1]]
<CompoundIDList>
<int>11557040</int>
<int>962</int>
<int>6101</int>
<int>208908</int>
</CompoundIDList>

[[2]]
<CompoundIDList>
<int>23667301</int>
<int>3001028</int>
<int>5360545</int>
</CompoundIDList>

[[3]]
<CompoundIDList>
</CompoundIDList>

attr(,"class")
[1] "XMLNodeSet"
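
One possible approach, sketched under some assumptions: the node set came from getNodeSet() in the XML package (so each element is an internal node), and the values of interest are always <int> children. The helper name first_value and the <root> wrapper used to rebuild the data are made up for illustration. The idea is to run a relative XPath query, ./int[1], on each node and fall back to NA when nothing matches, instead of calling xmlValue on the whole node.

```r
library(XML)

# Hypothetical reconstruction of the node set from the question,
# wrapped in an invented <root> element so it can be parsed.
xml_text <- "<root>
<CompoundIDList><int>11557040</int><int>962</int><int>6101</int><int>208908</int></CompoundIDList>
<CompoundIDList><int>23667301</int><int>3001028</int><int>5360545</int></CompoundIDList>
<CompoundIDList>
</CompoundIDList>
</root>"
parsed <- xmlParse(xml_text, asText = TRUE)
doc <- getNodeSet(parsed, "//CompoundIDList")

# Return the text of the first <int> child, or NA if the node has none.
first_value <- function(node) {
  hit <- getNodeSet(node, "./int[1]")
  if (length(hit) == 0) NA_character_ else xmlValue(hit[[1]])
}

res <- sapply(doc, first_value)
print(res)
```

Note that this gives a real missing value (NA_character_) for the empty node rather than the string "NA"; if the literal string is wanted, replace NA_character_ with "NA" in the helper.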
