XML: using xml_split(1) to split an XML file based on a Perl regex or XPath

I have a huge XML file that I want to split into chunks based on the product "type" attribute. I don't know how to use XSLT. I found xml_split, but I can't figure out how to use it with a regex or XPath to split the XML document depending on the "type" attribute.

  <?xml version="1.0"?>
  <!DOCTYPE catalog SYSTEM "catalog.dtd">
  <catalog>
     <product type="cloths" product_image="cardigan.jpg">
        <catalog_item gender="Men's">
           <item_number>QWZ5671</item_number>
           <price>39.95</price>
           <size description="Medium">
              <color_swatch image="red_cardigan.jpg">Red</color_swatch>
              <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
           </size>
           <size description="Large">
              <color_swatch image="red_cardigan.jpg">Red</color_swatch>
              <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
           </size>
        </catalog_item>
        <catalog_item gender="Women's">
           <item_number>RRX9856</item_number>
           <price>42.50</price>
           <size description="Small">
              <color_swatch image="red_cardigan.jpg">Red</color_swatch>
              <color_swatch image="navy_cardigan.jpg">Navy</color_swatch>
              <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
           </size>
           <size description="Medium">
              <color_swatch image="red_cardigan.jpg">Red</color_swatch>
              <color_swatch image="navy_cardigan.jpg">Navy</color_swatch>
              <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
              <color_swatch image="black_cardigan.jpg">Black</color_swatch>
           </size>
           <size description="Large">
              <color_swatch image="navy_cardigan.jpg">Navy</color_swatch>
              <color_swatch image="black_cardigan.jpg">Black</color_swatch>
           </size>
           <size description="Extra Large">
              <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
              <color_swatch image="black_cardigan.jpg">Black</color_swatch>
           </size>
        </catalog_item>
     </product>
  </catalog>

I ran the following command:

  xml_split -c /catalog/product[@type='cloths'] products.xml

but it reproduces the complete XML document without applying the XPath filter!
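For comparison, here is a minimal sketch of the grouping I am after, written in Python with the standard-library `xml.etree.ElementTree` module instead of xml_split. It is not a fix for the command above. The function name `split_by_type`, the `"untyped"` fallback key, and the sample data (the "shoes" product, the extra image names, and item number ABC1234) are hypothetical and only there for illustration. It also loads the whole document into memory, so a truly huge file would probably need a streaming parser instead.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_by_type(xml_text, attr="type"):
    """Group the root's <product> children by the given attribute.

    Returns a dict mapping each attribute value to a serialized XML
    document that contains only the products with that value.
    """
    root = ET.fromstring(xml_text)

    # Collect products under their attribute value; products without the
    # attribute go under the hypothetical key "untyped".
    groups = defaultdict(list)
    for product in root.findall("product"):
        groups[product.get(attr, "untyped")].append(product)

    # Build one new root element per group, reusing the original root tag.
    chunks = {}
    for value, products in groups.items():
        new_root = ET.Element(root.tag, root.attrib)
        new_root.extend(products)
        chunks[value] = ET.tostring(new_root, encoding="unicode")
    return chunks


# Hypothetical, shortened sample data in the shape of the catalog above.
SAMPLE = """<catalog>
  <product type="cloths" product_image="cardigan.jpg"><item_number>QWZ5671</item_number></product>
  <product type="shoes" product_image="boot.jpg"><item_number>ABC1234</item_number></product>
  <product type="cloths" product_image="scarf.jpg"><item_number>RRX9856</item_number></product>
</catalog>"""

if __name__ == "__main__":
    for value, chunk in split_by_type(SAMPLE).items():
        print(value, len(ET.fromstring(chunk).findall("product")))
```

Each value in the returned dict could then be written to its own file, one per product type.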
