JavaScript: Getting values out of an XML file with weird formatting (Recursion?)

I'm trying to iterate through an XML file that has weird formatting. I used pdftohtml to generate the XML file; the output is odd, but it's more usable than the HTML output.

Here's an example:

    <text height="11" font="3">Lastname1, Firstname1</text>
    <text height="11" font="3">111111-1</text>
    <text height="6" font="2">random text</text>
    <text height="6" font="2">random text</text>
    <text height="11" font="3">Lastname2, Firstname2</text>
    <text height="11" font="3">222222-2</text>
    <text height="6" font="2">random text</text>
    <text height="6" font="2">random text</text>
    <text height="11" font="3">Lastname3, Firstname3</text>
    <text height="11" font="3">name3long</text>
    <text height="11" font="3">333333-3</text>
    <text height="6" font="2">random text</text>
    <text height="6" font="2">random text</text>
    <text height="11" font="3">Lastname4, Firstname4</text>
    <text height="11" font="3">444444-4</text>
    <text height="11" font="3">Lastname5, Firstname5</text>
    <text height="11" font="3">555555-5</text>
    <text height="11" font="3">Lastname6, Firstname6</text>
    <text height="11" font="3">name6long</text>
    <text height="11" font="3">666666-3</text>

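To break it down: each name block starts with a name line that has the attributes height: 11 and font: 3, and it ends with the ID line, which has the same attributes but is always 8 characters long. A long name can spill onto a second line (see Lastname3 and Lastname6), so a block is not always two lines.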

I thought recursion would solve my problem, but it doesn't give me the output I want. I'm trying to get the line numbers where each name block starts and where it ends.

Here's a sample of the code I'm using:

    var txt = xml.getElementsByTagName('text');

    function block(b){
        var line = txt[b];
        if(line.innerHTML.length == 8){
            return b;
        }
        else{
            block(b+1);
        }
    }

    function getNameBlock(){

        // Notes: Name and Employee ID has attributes of height: 11, left: 62, and font: 3
        // Employee ID has always length: 8;
        //
        // Start value should be assigned when we hit the attributes of height: 11, left: 62, font: 3
        // End value should be assigned when we hit the attributes above as well as length: 8
        // Console output will be start and end values

        for(var i=0;i<txt.length;i++){
            var line = txt[i];
            var start;
            var end;
            if(line.getAttribute('height') == '11' && line.getAttribute('left') == '62' && line.getAttribute('font') == 3){
                start = i;
                end = block(start)

                console.log("Start: "+start+" End: "+end);

            }

        }
    }

The output isn't what I want. It gives me:

    Start: 0 End: undefined
    Start: 1 End: 1
    Start: 4 End: undefined
    Start: 5 End: 5
    etc....

Am I just trying to complicate things with recursion?
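
For what it's worth, the output above is consistent with two things in the code: the `else` branch of `block` calls `block(b+1)` without returning its result, so any block that needs at least one recursive step comes back as `undefined`; and the `for` loop also treats the ID line itself as a new start, which is where `Start: 1 End: 1` comes from. A plain loop that jumps past each finished block avoids both. The following is only a minimal sketch under a few assumptions: `findNameBlocks` is a hypothetical helper name, the lines are plain `{ height, font, text }` objects rather than DOM elements so it can run anywhere, the `left` check is left out because the sample XML has no `left` attribute, and the search for the ID starts one line after the name so that a short name is not mistaken for an ID.

```javascript
// Hypothetical helper, not from the original code.
// In a browser, `lines` could be built from the parsed document with:
//   Array.from(xml.getElementsByTagName('text'), function (el) {
//     return {
//       height: el.getAttribute('height'),
//       font: el.getAttribute('font'),
//       text: el.textContent
//     };
//   });
function findNameBlocks(lines) {
  // getAttribute returns strings, so compare against '11' and '3'.
  const isNameStyle = (line) => line.height === '11' && line.font === '3';
  const blocks = [];
  let i = 0;

  while (i < lines.length) {
    if (!isNameStyle(lines[i])) {
      i += 1;
      continue;
    }

    const start = i;
    // Assumption: a block has at least a name line plus an ID line,
    // so the search for the 8-character ID begins at start + 1.
    let end = start + 1;
    while (end < lines.length && isNameStyle(lines[end]) && lines[end].text.length !== 8) {
      end += 1;
    }

    // Record the block only if the scan stopped on a name-styled, 8-character line.
    if (end < lines.length && isNameStyle(lines[end])) {
      blocks.push({ start: start, end: end });
    }

    // Continue after the line where the scan stopped, so the ID line
    // is never treated as the start of another block.
    i = end + 1;
  }

  return blocks;
}

const sample = [
  { height: '11', font: '3', text: 'Lastname1, Firstname1' },
  { height: '11', font: '3', text: '111111-1' },
  { height: '6', font: '2', text: 'random text' },
  { height: '11', font: '3', text: 'Lastname3, Firstname3' },
  { height: '11', font: '3', text: 'name3long' },
  { height: '11', font: '3', text: '333333-3' }
];

for (const b of findNameBlocks(sample)) {
  console.log('Start: ' + b.start + ' End: ' + b.end);
}
// Start: 0 End: 1
// Start: 3 End: 5
```

If the recursive version is preferred, adding `return` in front of `block(b+1)` should fix the `undefined` results, but the loop would still need to skip ahead to `end + 1` after each block, and `block` would need a bounds check for the case where no ID line follows.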
