I'm new here, so I apologize for my bad English. I have two files: file 1 is the main XML file and file 2 is a description file. I want to insert the descriptions, line by line, at a specific position in the XML file: each line of file 2 should replace the XX in the corresponding Hit_def element.
file 1:
...
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gi|939543432|gb|KPV42113.1|</Hit_id>
<Hit_def>XX</Hit_def>
<Hit_accession>KPV42113.1</Hit_accession>
<Hit_len>162</Hit_len>
<Hit_hsps>
<Hsp>
...
<Hit>
<Hit_num>2</Hit_num>
<Hit_id>gi|385280362|gb|EIF44286.1|</Hit_id>
<Hit_def>XX</Hit_def>
<Hit_accession>EIF44286.1</Hit_accession>
<Hit_len>327</Hit_len>
<Hit_hsps>
...
<Hit>
<Hit_num>3</Hit_num>
<Hit_id>gi|550913550|ref|WP_022666548.1|</Hit_id>
<Hit_def>XX</Hit_def>
<Hit_accession>WP_022666548.1</Hit_accession>
<Hit_len>721</Hit_len>
<Hit_hsps>
<Hsp>
...
file 2:
peptide ABC transporter ATPase, partial [Kouleothrix aurantiaca]
oligopeptide ABC transporter [gamma proteobacterium BDW918]
ABC transporter ATP-binding protein [Desulfospira joergensenii]
output should be:
...
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gi|939543432|gb|KPV42113.1|</Hit_id>
<Hit_def>peptide ABC transporter ATPase, partial [Kouleothrix aurantiaca]</Hit_def>
<Hit_accession>KPV42113.1</Hit_accession>
<Hit_len>162</Hit_len>
<Hit_hsps>
<Hsp>
...
<Hit>
<Hit_num>2</Hit_num>
<Hit_id>gi|385280362|gb|EIF44286.1|</Hit_id>
<Hit_def>oligopeptide ABC transporter [gamma proteobacterium BDW918]</Hit_def>
<Hit_accession>EIF44286.1</Hit_accession>
<Hit_len>327</Hit_len>
<Hit_hsps>
...
<Hit>
<Hit_num>3</Hit_num>
<Hit_id>gi|550913550|ref|WP_022666548.1|</Hit_id>
<Hit_def>ABC transporter ATP-binding protein [Desulfospira joergensenii]</Hit_def>
<Hit_accession>WP_022666548.1</Hit_accession>
<Hit_len>721</Hit_len>
<Hit_hsps>
<Hsp>
...
My first attempts at writing a script failed completely, so I hope someone can help me.
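For illustration, here is a minimal Python sketch of one way the replacement could work. It is not a tested solution for the real files. It assumes file 2 holds exactly one description per line, in the same order as the hits in file 1, and that every placeholder appears literally as `<Hit_def>XX</Hit_def>`. The function names `fill_hit_defs` and `fill_file` are hypothetical, chosen only for this example. The XML is treated as plain text rather than parsed, so the rest of the file is left byte-for-byte as it was.

```python
import re
from xml.sax.saxutils import escape

# Assumption: every placeholder looks exactly like this in file 1.
PLACEHOLDER = re.compile(r"<Hit_def>XX</Hit_def>")


def fill_hit_defs(xml_text, descriptions):
    """Replace each <Hit_def>XX</Hit_def> with the next description, in order."""
    remaining = iter(descriptions)

    def next_def(match):
        try:
            description = next(remaining)
        except StopIteration:
            raise ValueError("fewer descriptions than XX placeholders") from None
        # Escape &, < and > so the result stays valid XML.
        return "<Hit_def>" + escape(description) + "</Hit_def>"

    return PLACEHOLDER.sub(next_def, xml_text)


def fill_file(xml_path, desc_path, out_path):
    """Read file 1 and file 2, write the filled XML to a new file."""
    with open(xml_path, encoding="utf-8") as handle:
        xml_text = handle.read()
    with open(desc_path, encoding="utf-8") as handle:
        # One description per non-empty line of file 2.
        descriptions = [line.strip() for line in handle if line.strip()]
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(fill_hit_defs(xml_text, descriptions))
```

A call such as `fill_file("file1.xml", "file2.txt", "output.xml")` would then write the merged result; those file names are placeholders. Writing to a new file instead of overwriting file 1 keeps the original intact if the line order in file 2 turns out not to match the hits.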