I am not familiar with importing XML files into R. I have looked into the existing questions regarding this topic but could not find one that seems to fit. I am thankful for your comments!
My problem is the following: I have an XML file with the following structure:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<PublicationData>
  <Products numberOfProducts="2094">
    <Product terminationReason="" isSalePermission="false" soldoutDeadline="" exhaustionDeadline="" name="Résanol Trio " wNbr="6016" id="7034">
      <ProductInformation>
        <ProductCategory primaryKey="7282"/>
        <ProductCategory primaryKey="7282"/>
        <FormulationCode primaryKey="6486"/>
        <DangerSymbol primaryKey="6513"/>
        <DangerSymbol primaryKey="6509"/>
        <CodeS primaryKey="6145"/>
        <CodeS primaryKey="6117"/>
        <CodeS primaryKey="6039"/>
        <CodeS primaryKey="6057"/>
        <CodeS primaryKey="6066"/>
        <CodeS primaryKey="6106"/>
        <CodeS primaryKey="6076"/>
        <CodeS primaryKey="6088"/>
        <CodeR primaryKey="5977"/>
        <CodeR primaryKey="5943"/>
        <CodeR primaryKey="5945"/>
        <CodeR primaryKey="5948"/>
        <CodeR primaryKey="6020"/>
        <PermissionHolderKey primaryKey="10115"/>
        <Ingredient additionalTextPrimaryKey="" inGrammPerLitre="" inPercent="7.5">
          <SubstanceType xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">ACTIVE_INGREDIENT</SubstanceType>
          <Substance primaryKey="898"/>
        </Ingredient>
        <Ingredient additionalTextPrimaryKey="" inGrammPerLitre="" inPercent="40.0">
          <SubstanceType xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">ACTIVE_INGREDIENT</SubstanceType>
          <Substance primaryKey="338"/>
        </Ingredient>
        <Ingredient additionalTextPrimaryKey="" inGrammPerLitre="" inPercent="15.0">
          <SubstanceType xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">ACTIVE_INGREDIENT</SubstanceType>
          <Substance primaryKey="190"/>
        </Ingredient>
        <Indication expenditureTo="" expenditureForm="8.0" waitingPeriod="" dosageTo="" dosageFrom="0.5">
          <Measure primaryKey="6518"/>
          <ApplicationArea primaryKey="3"/>
          <ApplicationComment primaryKey="868"/>
          <Culture additionalTextPrimaryKey="" primaryKey="9953"/>
          <Pest type="PEST_FULL_EFFECT" additionalTextPrimaryKey="" primaryKey="10506"/>
          <Pest type="PEST_FULL_EFFECT" additionalTextPrimaryKey="6964" primaryKey="10508"/>
          <Pest type="PEST_FULL_EFFECT" additionalTextPrimaryKey="" primaryKey="10507"/>
          <Pest type="PEST_PARTIAL_EFFECT" additionalTextPrimaryKey="" primaryKey="10533"/>
          <Obligation primaryKey="12317"/>
          <Obligation primaryKey="11380"/>
          <Obligation primaryKey="9156"/>
          <Obligation primaryKey="9735"/>
          <Obligation primaryKey="9906"/>
        </Indication>
      </ProductInformation>
    </Product>

I am trying to extract the information in the "ProductInformation" node and tried
library(XML)
xmlfile <- xmlParse("filepath")
relevant <- xpathApply(xmlfile, "//*/Products/Product")
relevant2 <- sapply(relevant, xmlValue)

which just gives me something like

[1] "ACTIVE_INGREDIENTACTIVE_INGREDIENT" [...] "ACTIVE_INGREDIENT" [2094] "ACTIVE_INGREDIENT"

Using instead

relevant2 <- sapply(relevant, xmlAttrs)

I was also not able to extract the information.
The answer to my problem is probably obvious, but I cannot figure it out. Thanks for your help!
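
For what it is worth, here is a minimal sketch of one possible approach, not a definitive answer. In this file almost all of the data sits in attributes of the nodes below "ProductInformation". xmlValue only returns text content, which would explain why only the "ACTIVE_INGREDIENT" strings show up, and xmlAttrs on the Product nodes returns the attributes of Product itself, not those of its descendants. The sketch assumes the XML package and uses a shortened, made-up inline sample (xml_text) instead of the real file. The names ing and ingredients and the output column names are hypothetical.

```r
library(XML)

# Shortened stand-in for the real file (assumption: same element and
# attribute names, but only one product with two ingredients).
xml_text <- '<PublicationData>
  <Products numberOfProducts="1">
    <Product name="Sample product" wNbr="6016" id="7034">
      <ProductInformation>
        <FormulationCode primaryKey="6486"/>
        <Ingredient additionalTextPrimaryKey="" inGrammPerLitre="" inPercent="7.5">
          <SubstanceType>ACTIVE_INGREDIENT</SubstanceType>
          <Substance primaryKey="898"/>
        </Ingredient>
        <Ingredient additionalTextPrimaryKey="" inGrammPerLitre="" inPercent="40.0">
          <SubstanceType>ACTIVE_INGREDIENT</SubstanceType>
          <Substance primaryKey="338"/>
        </Ingredient>
      </ProductInformation>
    </Product>
  </Products>
</PublicationData>'

# For the real file this would be xmlParse("filepath") instead.
doc <- xmlParse(xml_text, asText = TRUE)

# Select the nodes that actually carry the attributes of interest.
ing <- getNodeSet(doc, "//Product/ProductInformation/Ingredient")

# One row per Ingredient: read attributes with xmlGetAttr and look up the
# owning Product and the child Substance with relative XPath expressions.
ingredients <- data.frame(
  product_id    = sapply(ing, function(n) xpathSApply(n, "ancestor::Product", xmlGetAttr, "id")),
  in_percent    = as.numeric(sapply(ing, xmlGetAttr, "inPercent")),
  substance_key = sapply(ing, function(n) xpathSApply(n, "./Substance", xmlGetAttr, "primaryKey")),
  stringsAsFactors = FALSE
)

print(ingredients)
```

The same pattern should carry over to the other repeated elements (for example Pest or Obligation under Indication) by changing the XPath and the attribute names. Building one table per element type and keeping the product id as a key avoids forcing the nested structure into a single flat table.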