XML: using sed to delete XML tags

I'm a newbie with the great editor called sed.

I want to delete all the XML tags and extract the string between one specific pair of tags, reportBody.

Here is how it looks on a single line:

  <?xml version="1.0" ?><SOAP-ENV:Envelope xmlns:SOAP-ENV="blablah"><SOAP-ENV:Body> <getReportResponse xmlns:msgns="blahblahblah" xmlns="blahblah"><return xmlns=""> <returnCode><majorReturnCode>000</majorReturnCode><minorReturnCode>0000</minorReturnCode></returnCode><reportName>blahblah</reportName><reportTitle>blahblahblahr</reportTitle><reportBody>STRING TO EXTRACT</reportBody><reportMimeType>text/csv</reportMimeType></return></getReportResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>

The problem is that the XML file can vary: sometimes it is written on a single line, sometimes on 2-3 lines, and the string to extract may span more than one line between the reportBody tags. So it can look something like this, or different again:

      <?xml version="1.0" ?><SOAP-ENV:Envelope xmlns:SOAP-ENV="blablah"><SOAP-ENV:Body>   <getReportResponse xmlns:msgns="blahblahblah" xmlns="blahblah">  <return xmlns=""> <returnCode>  <majorReturnCode>000</majorReturnCode><minorReturnCode>0000</minorReturnCode>  </returnCode>  <reportName>blahblah</reportName><reportTitle>blahblahblahr</reportTitle><reportBody>  STRING   TO   EXTRACT</reportBody>  <reportMimeType>text/csv</reportMimeType></return>  </getReportResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>

What is the solution that copes with all of these variations? Also, can I pass parameters to save the result to files and decode the string with base64? Thanks!
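For reference, here is a rough sketch of the kind of thing I have in mind, not a tested solution. It assumes there is exactly one reportBody element, that its content contains no nested tags, and that it is acceptable to turn line breaks into spaces before matching. The function names `extract_report_body` and `decode_report_body` are made up for illustration, and the base64 step assumes the body really is base64-encoded and that the local `base64` tool accepts `--decode`.

```shell
# Hypothetical helper: print the text between <reportBody> and </reportBody>.
# Reads the XML on stdin. Newlines are turned into spaces first, so the
# same sed expression works for single-line and multi-line input.
extract_report_body() {
  tr '\n' ' ' |
    sed -n 's/.*<reportBody>\(.*\)<\/reportBody>.*/\1/p'
}

# Hypothetical helper: extract the body, strip all whitespace, then
# base64-decode it (assumes the body is base64 text).
decode_report_body() {
  extract_report_body | tr -d '[:space:]' | base64 --decode
}
```

Usage would be something like `extract_report_body < response.xml > body.txt`, where both file names are placeholders. Because sed only matches text patterns, this breaks as soon as the tag gains attributes or appears twice, so a real XML parser is probably the safer route if the format keeps changing.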
