Given XML that looks like this:
<collection>
<name>Bob</name>
<name>Bob</name>
<name>Linda</name>
</collection>
<collection>
<name>Linda</name>
<name>Tina</name>
</collection>
I want to merge the collections and remove duplicates among the children of the collection elements, so that I end up with:
<collection>
<name>Bob</name>
<name>Linda</name>
<name>Tina</name>
</collection>
Currently, I'm using lxml.etree to parse the XML and grab the children. I then convert each child element to a string (e.g. 'Bob'), convert the list of these strings to a set to get the unique values, and write the unique values back into XML.
This seems circuitous and clunky, though. Is there a more elegant way?
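One possible approach, offered as a minimal sketch rather than a definitive answer: walk the name elements once, track the text values already seen in a set, and append only first occurrences to a new collection element. This keeps document order, which a plain set would not. The sketch assumes the two collection elements need a wrapper to parse (two top-level elements are not a well-formed document), so the `<root>` wrapper and the helper name `merge_collections` are hypothetical. It also assumes each name holds only text. The calls used exist in both lxml.etree and the standard library's xml.etree.ElementTree, so the import falls back to the latter if lxml is not installed.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback offering the same calls


def merge_collections(xml_fragment):
    """Merge all <collection> elements into one, dropping duplicate <name> values.

    Hypothetical helper: the <root> wrapper is an assumption made only so the
    fragment parses as a single well-formed document.
    """
    root = etree.fromstring("<root>" + xml_fragment + "</root>")
    merged = etree.Element("collection")
    seen = set()
    for child in root.findall("collection/name"):
        text = (child.text or "").strip()
        if text not in seen:  # keep only the first occurrence, preserving order
            seen.add(text)
            etree.SubElement(merged, "name").text = text
    return merged


fragment = """
<collection>
<name>Bob</name>
<name>Bob</name>
<name>Linda</name>
</collection>
<collection>
<name>Linda</name>
<name>Tina</name>
</collection>
"""

merged = merge_collections(fragment)
print(etree.tostring(merged).decode())
```

The set is still doing the deduplication here, but the elements are built in the same pass, so there is no separate "convert to strings, then write back" step. If children can carry attributes or nested elements, comparing text alone would not be enough and the key stored in `seen` would need to change.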