Bullet point entity is shown as a box (unknown entity) and has to be captured by a Perl program



I have an XML file in which the bullet point appears as a box entity, and I am unable to capture it with my Perl program. Can someone help me with this?


Part of the input data:


<p> Adding Basic Requirements: AU sec. 334 suggests procedures for the auditor's consideration, noting that not all of them may be required in every audit.</p>


Expected output:


<p>Adding Basic Requirements: AU sec. 334 suggests procedures for the auditor's consideration, noting that not all of them may be required in every audit.</p>


Perl program:



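use strict;
use warnings;

my ($filename, $ext) = @ARGV;
die "Usage: $0 <filename> <extension>\n" unless defined $filename && defined $ext;
my $inputfile = "$filename.$ext";

# Decode the input as UTF-8 (an assumption about the file's encoding) so the
# regex below sees characters rather than raw bytes.
my $document = do {
    local $/ = undef;
    open my $fh, '<:encoding(UTF-8)', $inputfile
        or die "Couldn't open the file $inputfile: $!";
    <$fh>;
};

open my $out, '>:encoding(UTF-8)', "$filename.sgm"
    or die "Couldn't write to the file $filename.sgm: $!";

# The character in the original pattern was lost (it showed up as "?", which
# is not a valid regex on its own). U+2022 BULLET is only a placeholder here;
# substitute the code point that actually appears in the file.
$document =~ s/\x{2022}/<i>/g;

print {$out} $document;
close $out or die "Couldn't close $filename.sgm: $!";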


Output:


The program is unable to capture the box-type entity, so the substitution does nothing and the output is unchanged.
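One way to narrow this down is to find out which code point the box really is before trying to match it. The following is only a minimal sketch: the helper name non_ascii_codepoints is made up for illustration, and U+F0B7 in the sample string is a hypothetical stand-in, since the actual character in the file is not known. It assumes the text has already been decoded into a character string (for example, read through an :encoding(UTF-8) layer).

```perl
use strict;
use warnings;

# Return a hash of "U+XXXX" => count for every non-ASCII character in $text.
# $text is expected to be a decoded character string, not raw bytes.
sub non_ascii_codepoints {
    my ($text) = @_;
    my %count;
    $count{ sprintf 'U+%04X', ord $1 }++ while $text =~ /([^\x00-\x7F])/g;
    return %count;
}

# Hypothetical sample: U+F0B7 stands in for the unidentified box character.
my $sample = "<p>\x{F0B7} Adding Basic Requirements</p>";
my %seen   = non_ascii_codepoints($sample);
print "$_ seen $seen{$_} time(s)\n" for sort keys %seen;
```

Once the report shows the code point, it can be used in the substitution as an escape such as \x{...} instead of pasting the literal character into the source.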

