Monday, 2 February 2015

Python regex to find positions of XML data



I want to extract the positions (start and end offsets) of the data in an XML request body, using a Python regex or any other method. The data can be numbers, words, IP addresses, or even tags. Sample request:



PUT /mg/co.xml HTTP/1.1
Host: 19.16.7.59
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0

<?xml version="1.0" encoding="UTF-8"?>
<!-- THIS DATA SUBJECT TO DISCLAIMER(S) INCLUDED WITH THE PRODUCT OF ORIGIN. -->
<io:zzzz xmlns:io="http://kfj/ledm/iomgmt/2008/11/30" xmlns:dd="http://jkfhkj/dictionaries/1.0/" xmlns:dd3="http://jfja/dictionaries/2009/04/06" xmlns:xsi="http://ift.tt/ra1lAU" xsi:schemaLocation="http://jcjhjk/ledm/iomgmt/2008/11/30 ../../schemas/gfjbj.xsd">
<io:aaaa>
<dd3:bbbb>hjgjg</dd3:bbbb>
</io:aaaa>
<io:cccc>
<io:dddd>
<dd3:ffff>15.34.2.5</dd3:ffff>
</io:dddd>
<io:eeee>
<dd3:gggg>67</dd3:gggg>
</io:eeee>
<io:iiii>
<dd3:jjjj><script>jgfjkgkj</script></dd3:jjjj>
</io:iiii>
</io:cccc>
</io:zzzz>


Expected Output:



(the positions given below are approximate)

hjgjg [start offset = 59, end offset = 64]
15.34.2.5 [start offset = 74, end offset = 84]
67 [start offset = 95, end offset = 97]
<script>jgfjkgkj</script> [start offset = 102, end offset = 124]


Can anybody please help me sort this out?
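
For illustration, here is a minimal sketch of one way the offsets asked for above might be computed with the standard `re` module. It rests on an assumption read off the sample and not confirmed anywhere else: every data value sits directly inside a `dd3:`-prefixed element. `SAMPLE`, `DATA_RE` and `find_data_positions` are hypothetical names, and `SAMPLE` is a shortened stand-in for the request body, so the offsets it yields will not match the approximate ones listed in the question.

```python
import re

# Hypothetical, shortened stand-in for the request body in the question.
SAMPLE = (
    '<io:aaaa>\n'
    '<dd3:bbbb>hjgjg</dd3:bbbb>\n'
    '</io:aaaa>\n'
    '<io:iiii>\n'
    '<dd3:jjjj><script>jgfjkgkj</script></dd3:jjjj>\n'
    '</io:iiii>\n'
)

# Assumption: data values always sit directly inside a dd3:-prefixed element.
# Group 1 is the element name, group 2 is everything up to the matching close
# tag, so nested markup such as <script>...</script> is kept as data.
DATA_RE = re.compile(
    r'<(dd3:[\w.-]+)(?:\s[^>]*)?>(.*?)</\1\s*>',
    re.DOTALL,
)


def find_data_positions(text):
    """Return (value, start, end) tuples; end is exclusive, as in slicing."""
    return [
        (m.group(2), m.start(2), m.end(2))
        for m in DATA_RE.finditer(text)
    ]


if __name__ == '__main__':
    for value, start, end in find_data_positions(SAMPLE):
        print(f'{value} [start offset = {start}, end offset = {end}]')
```

The offsets are character positions relative to the string passed in, so passing the whole raw request (headers included) would give positions counted from the start of the request. They are character offsets, not byte offsets; the two differ once the body contains non-ASCII characters. A regex like this is only a sketch for input shaped like the sample and would not cope with comments, CDATA sections, or a `dd3:` element nested inside another element of the same name.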

