I have the following XML:
<test1>
  <test2>
    <text>This is a question on xpath
    </text>
  </test2>
  <test3>
    <test2>
      <text>Do not extract this
      </text>
    </test2>
  </test3>
</test1>
I need to extract the text within test2/text, but not when test2 is nested inside test3. How can this be done in XPath? I tried looping over the matches and checking their ancestors, with something like:
for p in lxml_tree.xpath('.//test2',namespaces={'w':w}):
    for q in p.iterancestors():
        if q.tag=="test3":
            break
    else:
        text+= ''.join(t.text for t in p.xpath('.//text'))
but this doesn't work. I guess XPath has a better way to exclude it in a single expression. Expected output: text = "This is a question on xpath"
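One possible single-expression approach, as a minimal sketch: XPath 1.0 has an ancestor axis, so a predicate such as not(ancestor::test3) can filter out any test2 that sits under a test3. The helper name extract_text and the XML constant below are made up for illustration, and stripping the trailing whitespace is an assumption made to match the expected output; the sample document has no namespaces, so none are passed.

```python
from lxml import etree

# Sample document from the question (hypothetical constant name).
XML = """<test1>
<test2>
<text>This is a question on xpath
</text>
</test2>
<test3>
<test2>
<text>Do not extract this
</text>
</test2>
</test3>
</test1>"""


def extract_text(root):
    """Join the text of every test2/text that has no test3 ancestor.

    extract_text is a hypothetical helper, not part of lxml.
    """
    # The predicate keeps only <test2> elements without a <test3> ancestor.
    nodes = root.xpath('.//test2[not(ancestor::test3)]/text')
    # Assumption: surrounding whitespace is not wanted in the result.
    return ''.join((node.text or '').strip() for node in nodes)


lxml_tree = etree.fromstring(XML)
text = extract_text(lxml_tree)
print(text)  # This is a question on xpath
```

If test2 should only be excluded when test3 is its direct parent, not(parent::test3) would be the narrower predicate.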