Bash, Remove empty XML tags



I need help with a couple of questions, using bash tools.



  1. I want to remove empty XML tags from a file, e.g.:



<CreateOfficeCode>
<OperatorId>ve</OperatorId>
<OfficeCode>1234</OfficeCode>
<CountryCodeLength>0</CountryCodeLength>
<AreaCodeLength>3</AreaCodeLength>
<Attributes></Attributes>
<ChargeArea></ChargeArea>
</CreateOfficeCode>


to become:



<CreateOfficeCode>
<OperatorId>ve</OperatorId>
<OfficeCode>1234</OfficeCode>
<CountryCodeLength>0</CountryCodeLength>
<AreaCodeLength>3</AreaCodeLength>
</CreateOfficeCode>


For this I currently use the following command:



sed -i '/><\//d' file


This is not strict; it is more of a trick, since it deletes every line that contains ></ regardless of the tag names. Something more appropriate would be to find the <pattern></pattern> pair and remove it. Any suggestions?
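
A stricter variant could use a sed back-reference, so that the closing tag has to repeat the name of the opening tag. This is only a sketch under assumptions that match the sample above: one element per line, no attributes, and plain tag names. The function name remove_empty_tags is made up for illustration.

```shell
# Delete lines that consist solely of an empty element such as <Tag></Tag>.
# \1 must repeat the captured tag name, so a mismatched pair like <A></B> is kept.
# Assumption: one element per line, no attributes on the opening tag.
remove_empty_tags() {
  sed '/^[[:space:]]*<\([A-Za-z_][A-Za-z0-9_.:-]*\)><\/\1>[[:space:]]*$/d' "$@"
}
```

It reads from the named file or from standard input and writes to standard output, e.g. remove_empty_tags file > file.new. Adding -i to the sed call would edit the file in place, as in the original command.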



  2. Second, how do I go from:



<CreateOfficeGroup>
<CreateOfficeName>John</CreateOfficeName>
<CreateOfficeCode>
</CreateOfficeCode>
</CreateOfficeGroup>


to:



<CreateOfficeGroup>
<CreateOfficeName>John</CreateOfficeName>
</CreateOfficeGroup>
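
For an empty element whose opening and closing tags sit on two consecutive lines, one possible approach is to let sed hold two lines at a time and delete the pair when the second line closes the first. This is a sketch, assuming no attributes and nothing but whitespace between the two tags; the function name remove_empty_pairs is hypothetical.

```shell
# Slide a two-line window over the input:
#   $!N   append the next line to the pattern space (except on the last line)
#   /../d drop both lines if they are <Tag> followed by </Tag> with the same name
#   P;D   otherwise print the first line and keep the second for the next cycle
remove_empty_pairs() {
  sed -e '$!N' \
      -e '/^[[:space:]]*<\([A-Za-z_][A-Za-z0-9_.:-]*\)>[[:space:]]*\n[[:space:]]*<\/\1>[[:space:]]*$/d' \
      -e 'P;D' "$@"
}
```

A single pass removes only pairs that are already adjacent; a parent that becomes empty as a result is caught by running the function again.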



  3. And how do I do both at once, going from:



<CreateOfficeGroup>
<CreateOfficeName>John</CreateOfficeName>
<CreateOfficeCode>
<OperatorId>ve</OperatorId>
<OfficeCode>1234</OfficeCode>
<CountryCodeLength>0</CountryCodeLength>
<AreaCodeLength>3</AreaCodeLength>
<Attributes></Attributes>
<ChargeArea></ChargeArea>
</CreateOfficeCode>
<CreateOfficeSize>
<Chairs></Chairs>
<Tables></Tables>
</CreateOfficeSize>
</CreateOfficeGroup>


to:



<CreateOfficeGroup>
<CreateOfficeName>John</CreateOfficeName>
<CreateOfficeCode>
<OperatorId>ve</OperatorId>
<OfficeCode>1234</OfficeCode>
<CountryCodeLength>0</CountryCodeLength>
<AreaCodeLength>3</AreaCodeLength>
</CreateOfficeCode>
</CreateOfficeGroup>
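
Both removals can be combined by repeating them until the text stops changing, so that a parent left empty by one pass is removed by the next. The following is a minimal sketch under the same assumptions as the sample data (one tag per line, no attributes); prune_empty_xml is a made-up name, and for arbitrary XML a real XML parser would be the safer tool.

```shell
# Repeatedly strip empty elements until a pass changes nothing.
# The ( ... ) body runs in a subshell, so prev and cur stay local.
prune_empty_xml() (
  prev=
  cur=$(cat "$@")
  while [ "$cur" != "$prev" ]; do
    prev=$cur
    cur=$(printf '%s\n' "$prev" |
      # Pass 1: empty element on a single line, e.g. <Tag></Tag>
      sed '/^[[:space:]]*<\([A-Za-z_][A-Za-z0-9_.:-]*\)><\/\1>[[:space:]]*$/d' |
      # Pass 2: empty element split over two consecutive lines
      sed -e '$!N' \
          -e '/^[[:space:]]*<\([A-Za-z_][A-Za-z0-9_.:-]*\)>[[:space:]]*\n[[:space:]]*<\/\1>[[:space:]]*$/d' \
          -e 'P;D')
  done
  # Print the result unless every line was removed.
  if [ -n "$cur" ]; then printf '%s\n' "$cur"; fi
)
```

Usage would be along the lines of prune_empty_xml file > file.new; the input file itself is left untouched.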


Could you answer each question separately? Thank you very much!

