I have an XML file which looks like this:
<Organism>
<Name>Bacillus halodurans C-125</Name>
<Enzyme>M.BhaII</Enzyme>
<Motif>GGCC</Motif>
<Enzyme>M1.BhaI</Enzyme>
<Motif>GCATC</Motif>
<Enzyme>M2.BhaI</Enzyme>
<Motif>GCATC</Motif>
</Organism>
<Organism>
<Name>Bacteroides eggerthii 1_2_48FAA</Name>
</Organism>
I'm trying to write it into a CSV file like this:
Bacillus halodurans, GGCC
Bacillus halodurans, GCATC
Bacillus halodurans, GCATC
Bacteroides,
My approach is to build a list of tuples, each holding an organism name together with one of its motifs. I tried this using the ElementTree module:
import xml.etree.ElementTree as ET
tree = ET.parse('file.xml')
rebase = tree.getroot()
list = []
for organisms in rebase.findall('Organism'):
    name = organisms.find('Name').text
    for each_organism in organisms.findall('Motif'):
        try:
            motif = organisms.find('Motif').text
            print name, motif
        except AttributeError:
            print name
However, the output I get looks like this:
Bacillus halodurans, GGCC
Bacillus halodurans, GGCC
Bacillus halodurans, GGCC
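Only the first motif gets recorded, repeated once for each Motif element, and the organism without any motifs is not printed at all. This is my first time working with ElementTree, so it's slightly confusing. Any help will be greatly appreciated.
I don't need help with writing to a CSV file.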
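A likely cause, as far as I can tell: inside the inner loop the code calls organisms.find('Motif'), which always returns the first Motif child, instead of using the loop variable each_organism. The inner loop also never runs for an organism with no Motif elements, so the except branch is never reached. Below is a minimal sketch of one way around both, not a definitive fix. It is written for Python 3 (print as a function), it wraps the sample data in a hypothetical <Rebase> root element because the real root name is not shown, and the function name organism_motif_pairs is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, wrapped in a hypothetical <Rebase> root
# element (assumption: the real root element name is not shown).
SAMPLE = """<Rebase>
<Organism>
<Name>Bacillus halodurans C-125</Name>
<Enzyme>M.BhaII</Enzyme>
<Motif>GGCC</Motif>
<Enzyme>M1.BhaI</Enzyme>
<Motif>GCATC</Motif>
<Enzyme>M2.BhaI</Enzyme>
<Motif>GCATC</Motif>
</Organism>
<Organism>
<Name>Bacteroides eggerthii 1_2_48FAA</Name>
</Organism>
</Rebase>"""


def organism_motif_pairs(root):
    """Return a list of (name, motif) tuples, one per <Motif> element."""
    pairs = []
    for organism in root.findall('Organism'):
        name = organism.find('Name').text
        motifs = organism.findall('Motif')
        if not motifs:
            # Organism without any motif: keep it with an empty motif field.
            pairs.append((name, ''))
        for motif in motifs:
            # Use the loop variable here; organism.find('Motif') would
            # return the first <Motif> child on every iteration.
            pairs.append((name, motif.text))
    return pairs


rows = organism_motif_pairs(ET.fromstring(SAMPLE))
for name, motif in rows:
    print(name + ', ' + motif)
```

With a real file, the root would come from ET.parse('file.xml').getroot() as in the code above. The sketch keeps the full organism name from the Name element; shortening it to match the desired CSV would be a separate step.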