How to call a Perl script, passing a string on STDIN and receiving the output as a string?



I'm trying to process the content of a MySQL database with latin1 encoding using Perl, HTML::Tidy and XML/XSLT. It seems almost impossible to handle the data from MySQL in a way that keeps XML::LibXML::Parser or the XSLT engine from complaining about wrong encoding. On http://ift.tt/1tC5cqO, I found a script that translates mixed latin1/utf8 content to "clean" utf8, but I'm not sure how to call it from my own Perl script, passing a string and receiving the output. One of my attempts was:



use IPC::Run3;
run3('repair_utf8.pl', \$xml, \$fileoutput, \$stderr);


and changing the penultimate line of the script from print $out; to print STDOUT $out;.


But it seems that the content of $fileoutput is still mixed utf8/iso-8859-1.
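One way to check whether the child's output really is clean UTF-8 is to pass the raw bytes in, capture the output, and decode it strictly. The following is only a sketch under assumptions: `run_filter_bytes` is a hypothetical helper name, and a Perl one-liner that copies STDIN to STDOUT stands in for repair_utf8.pl so the example runs without that script. It also assumes the input is a byte string that has not already been passed through decode().

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);
use IPC::Run3;

# Hypothetical helper: feed raw bytes to a filter command on STDIN and
# return its STDOUT as a character string.
sub run_filter_bytes {
    my ($cmd, $bytes) = @_;
    my ($out, $err) = ('', '');

    # All three handles are passed as scalar references.
    run3($cmd, \$bytes, \$out, \$err);
    die "filter failed (status $?): $err" if $? != 0;

    # Strict decode: croaks if the output is not well-formed UTF-8,
    # which makes leftover Latin-1 bytes visible immediately.
    return decode('UTF-8', $out, FB_CROAK);
}

# Stand-in for repair_utf8.pl (assumption): copies STDIN to STDOUT unchanged.
my @cmd = ($^X, '-e', 'binmode STDIN; binmode STDOUT; print while <STDIN>');

my $result = run_filter_bytes(\@cmd, "caf\xC3\xA9\n");
```

If the strict decode dies on the real script's output, the problem is in what the script emits (or in what was fed to it), not in how run3 is called.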


I've already tried other approaches to "clean up" the db output, e.g. $xml = decode( 'iso-8859-1', $xml ); but after some HTML::Tidy->clean() and XML::Saxon::XSLT2->transform calls, XML::LibXML::Parser still complains about non-UTF8 characters. Is there a best practice for transforming latin1 content into utf8-based XML?
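For reference, the mixed latin1/utf8 repair can also be sketched in-process with the core Encode module, avoiding the external script entirely. This is not the linked script's method, just a common heuristic: decode the longest valid UTF-8 prefix, treat the first offending byte as Latin-1, and repeat. The name `mixed_bytes_to_chars` is hypothetical, and the input is assumed to be a raw byte string straight from the database. A Latin-1 byte pair that happens to form a valid UTF-8 sequence would be read as UTF-8, so this is a best guess rather than a guarantee.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_QUIET);

# Hypothetical helper: turn a byte string containing a mix of UTF-8 and
# Latin-1 into a Perl character string.
sub mixed_bytes_to_chars {
    my ($bytes) = @_;
    my $chars = '';
    while (length $bytes) {
        # With FB_QUIET, decode returns the valid UTF-8 prefix and leaves
        # the unprocessed remainder in $bytes.
        $chars .= decode('UTF-8', $bytes, FB_QUIET);
        last unless length $bytes;

        # The next byte is not valid UTF-8 at this position:
        # remove it from $bytes and interpret it as Latin-1.
        $chars .= decode('iso-8859-1', substr($bytes, 0, 1, ''));
    }
    return $chars;
}

# "caf" + Latin-1 e-acute, a space, then UTF-8 e-acute.
my $raw   = "caf\xE9 \xC3\xA9";
my $clean = encode('UTF-8', mixed_bytes_to_chars($raw));
```

Here `$clean` is a UTF-8 byte string. Whether later steps expect bytes or characters depends on each module, so that boundary is worth checking for HTML::Tidy, the XSLT engine and the XML parser individually.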

