XML: find tags inside an XML document with Python

I have an XML like this:

    <w:p>
        <w:r>
            <w:rPr />
            <w:t> Description 1</w:t>
        </w:r>
    </w:p>
    <w:p>
        <w:r>
            <w:rPr />
            <w:t>Checkbox 1</w:t>
        </w:r>
        <w:r>
            <w:fldChar w:fldCharType="begin">
                <w:ffData>
                    <w:name w:val="" />
                    <w:enabled />
                    <w:calcOnExit w:val="0" />
                    <w:checkBox>
                        <w:sizeAuto />
                        <w:checked />
                    </w:checkBox>
                </w:ffData>
            </w:fldChar>
        </w:r>
        <w:r>
            <w:rPr />
            <w:t> Checkbox 2</w:t>
        </w:r>
        <w:r>
            <w:fldChar w:fldCharType="begin">
                <w:ffData>
                    <w:name w:val="" />
                    <w:enabled />
                    <w:calcOnExit w:val="0" />
                    <w:checkBox>
                        <w:sizeAuto />
                    </w:checkBox>
                </w:ffData>
            </w:fldChar>
        </w:r>
    </w:p>
    <w:p>
        <w:r>
            <w:rPr />
            <w:t> Description 2</w:t>
        </w:r>
    </w:p>
    <w:p>
        <w:r>
            <w:rPr />
            <w:t> Description 3</w:t>
        </w:r>
    </w:p>
    .....

This XML contains a series of <w:p> </w:p> paragraphs. Some description <w:p> paragraphs are followed by a paragraph that contains checkbox tags, and some are not. For each description I need to create a JSON object and store it in a list.

I need to find each description paragraph and take the text inside its <w:t>, then move on to the next <w:p> tag and check whether it contains a checkbox. If it does, I take its <w:t> value as well, and the JSON looks like this:

    json['description'] = description
    json['checkbox_text'] = checkbox

Otherwise, if the paragraph after the description contains no checkbox, the JSON contains only one element:

    json['description'] = description

My code looks like this:

    import re

    results = []
    # Offsets of every occurrence of the string 'w:p' in the raw XML text
    default_positions = [m.start() for m in re.finditer('w:p', xml_content)]
    jsonobj = {}
    for position in default_positions:
        if .. :
            # code
        else:
            # code

Any help?
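
One possible approach is to parse the XML instead of searching for string offsets, since a plain search for 'w:p' also matches closing tags. The sketch below uses the standard library's `xml.etree.ElementTree`. It rests on several assumptions that are not in the post: the `w:` prefix is bound to the usual WordprocessingML namespace, the fragment has to be wrapped in a dummy root element to be parseable, a paragraph counts as a "checkbox paragraph" if it contains a `<w:checkBox>` anywhere inside it, and `checkbox_text` is stored as a list of all `<w:t>` values from that paragraph. The names `parse_paragraphs` and `paragraph_texts` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the w: prefix refers to the standard WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def paragraph_texts(paragraph):
    """Return the stripped, non-empty <w:t> texts found inside one <w:p>."""
    return [
        t.text.strip()
        for t in paragraph.iterfind(".//w:t", NS)
        if t.text and t.text.strip()
    ]


def has_checkbox(paragraph):
    """True if the paragraph contains a <w:checkBox> element at any depth."""
    return paragraph.find(".//w:checkBox", NS) is not None


def parse_paragraphs(xml_fragment):
    """Build one dict per description paragraph (hypothetical helper)."""
    # The fragment has no single root and no namespace declaration,
    # so wrap it in a dummy root that declares the w: prefix.
    root = ET.fromstring('<root xmlns:w="%s">%s</root>' % (W_NS, xml_fragment))
    paragraphs = root.findall(".//w:p", NS)

    results = []
    i = 0
    while i < len(paragraphs):
        current = paragraphs[i]
        if has_checkbox(current):
            # A checkbox paragraph with no description before it: skip it.
            i += 1
            continue
        entry = {"description": " ".join(paragraph_texts(current))}
        following = paragraphs[i + 1] if i + 1 < len(paragraphs) else None
        if following is not None and has_checkbox(following):
            entry["checkbox_text"] = paragraph_texts(following)
            i += 1  # the checkbox paragraph has been consumed
        results.append(entry)
        i += 1
    return results
```

Calling `parse_paragraphs(xml_content)` returns a list of dictionaries that can be passed to `json.dumps`. If the real file is a complete document that already declares the namespace and has a single root, the wrapping step is unnecessary and `ET.fromstring(xml_content)` can be used directly.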
