Using Perl XML::LibXML to read textContent that contains HTML tags



If I have the following XML:




<File id="MyTestApp/app/src/main/res/values/strings.xml">
<Identifier id="page_title" isArray="0" isPlural="0">
<EngTranslation eng_indx="0" goesWith="-1" index="0">My First App</EngTranslation>
<Description index="0">Home page title</Description>
<LangTranslation index="0">My First App</LangTranslation>
</Identifier>
<Identifier id="count" isArray="0" isPlural="0">
<EngTranslation eng_indx="0" goesWith="-1" index="0">You have <b>%1$d</b> view(s)</EngTranslation>
<Description index="0">Number of page views</Description>
<LangTranslation index="0">You have <b>%1$d</b> views</LangTranslation>
</Identifier>
</File>


I'm trying to read the 'EngTranslation' text value, and want to return the full value including any HTML tags. For example, I have the following:



use strict;
use warnings;
use XML::LibXML;
use Encode qw(encode);

my $parser = XML::LibXML->new;
my $dom = $parser->parse_file("test.xml") or die;

# Walk every <File>, then every <Identifier> inside it
foreach my $file ($dom->findnodes('/File')) {
    print $file->getAttribute("id")."\n";
    foreach my $identifier ($file->findnodes('./Identifier')) {
        print $identifier->getAttribute("id")."\n";
        # textContent returns only the text nodes, so inline tags such as <b> are dropped
        print encode('UTF-8',$identifier->findnodes('./EngTranslation')->get_node(1)->textContent."\n");
        print encode('UTF-8',$identifier->findnodes('./Description')->get_node(1)->textContent."\n");
        print encode('UTF-8',$identifier->findnodes('./LangTranslation')->get_node(1)->textContent."\n");
    }
}


The output I get is:

MyTestApp/app/src/main/res/values/strings.xml
page_title
My First App
Home page title
My First App
count
You have %1$d view(s)
Number of page views
You have %1$d views


What I'm hoping to get is:

MyTestApp/app/src/main/res/values/strings.xml
page_title
My First App
Home page title
My First App
count
You have <b>%1$d</b> view(s)
Number of page views
You have <b>%1$d</b> views


This is a simplified example of a more complicated situation; hopefully it makes sense.


Thanks!
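
One possible approach, offered as a minimal sketch rather than a definitive answer: textContent concatenates only the text nodes below an element, so inline markup is lost. Serializing each child node with toString and joining the pieces keeps the tags. The helper name inner_xml and the trimmed inline sample are assumptions made for illustration; they are not part of XML::LibXML or of the original file.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed sample taken from the question, inlined so the sketch runs without test.xml.
# The single-quoted heredoc keeps Perl from interpolating "$d".
my $xml = <<'XML';
<Identifier id="count">
<EngTranslation index="0">You have <b>%1$d</b> view(s)</EngTranslation>
</Identifier>
XML

my $dom = XML::LibXML->load_xml(string => $xml);

# inner_xml is a hypothetical helper, not an XML::LibXML method.
# It serializes every child node (text and elements) and joins the results.
sub inner_xml {
    my ($node) = @_;
    return join '', map { $_->toString } $node->childNodes;
}

my ($eng)  = $dom->findnodes('/Identifier/EngTranslation');
my $text   = $eng->textContent;   # text nodes only, tags dropped
my $markup = inner_xml($eng);     # child nodes serialized, tags kept

print "$text\n";
print "$markup\n";
```

In the loop from the question, the same helper could be called in place of textContent for each of the three elements. Note that toString re-escapes special characters in text nodes (for example & becomes &amp;), which may or may not be what the more complicated case needs.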

