XML: How to query an XML tree based on another XML decision tree?

I am extremely new to working with XML, and I'm trying to figure out how to query one tree based on another. I have two XML files.

The beginning of the first file:

      <rootNode>
        <splitNode leftSplit="Local-gov,Federal-gov,State-gov" rightSplit="Self-emp-inc,Private,Self-emp-not-inc,?" splitAttr="workclass" splitType="discrete">
          <splitNode splitAttr="hours-per-week" splitType="continuous" splitVal="39.53">
            <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
            <splitNode leftSplit="10th,Assoc-voc,Some-college,Masters,7th-8th" rightSplit="HS-grad,Bachelors,9th,12th,Assoc-acdm" splitAttr="education" splitType="discrete">
              <splitNode leftSplit="United-States" rightSplit="?" splitAttr="native-country" splitType="discrete">
                <splitNode splitAttr="capital-gain" splitType="continuous" splitVal="1554.0">
                  <splitNode splitAttr="education-num" splitType="continuous" splitVal="9.55">
                    <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
                    <splitNode splitAttr="education-num" splitType="continuous" splitVal="10.56">
                      <splitNode splitAttr="hours-per-week" splitType="continuous" splitVal="41.0">
                        <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
                        <splitNode splitAttr="hours-per-week" splitType="continuous" splitVal="43.5">
                          <leafNode incomeLevel=">50K">Leaf</leafNode>
                          <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
                        </splitNode>
                      </splitNode>
                      <splitNode splitAttr="education-num" splitType="continuous" splitVal="12.5">
                        <leafNode incomeLevel=">50K">Leaf</leafNode>
                        <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
                      </splitNode>
                    </splitNode>
                  </splitNode>
                  <leafNode incomeLevel=">50K">Leaf</leafNode>
                </splitNode>
                <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
              </splitNode>

The beginning of the second file:

      <People>
        <Person age="50" capital-gain="0" capital-loss="0" education="Bachelors" education-num="13" fnlwgt="83311" hours-per-week="13" income-level="&lt;=50K" marital-status="Married-civ-spouse" native-country="United-States" occupation="Exec-managerial" race="White" relationship="Husband" sex="Male" workclass="Self-emp-not-inc"/>
        <Person age="53" capital-gain="0" capital-loss="0" education="11th" education-num="7" fnlwgt="234721" hours-per-week="40" income-level="&lt;=50K" marital-status="Married-civ-spouse" native-country="United-States" occupation="Handlers-cleaners" race="Black" relationship="Husband" sex="Male" workclass="Private"/>
        <Person age="31" capital-gain="14084" capital-loss="0" education="Masters" education-num="14" fnlwgt="45781" hours-per-week="50" income-level=">50K" marital-status="Never-married" native-country="United-States" occupation="Prof-specialty" race="White" relationship="Not-in-family" sex="Female" workclass="Private"/>
        <Person age="30" capital-gain="0" capital-loss="0" education="Bachelors" education-num="13" fnlwgt="141297" hours-per-week="40" income-level=">50K" marital-status="Married-civ-spouse" native-country="India" occupation="Prof-specialty" race="Asian-Pac-Islander" relationship="Husband" sex="Male" workclass="State-gov"/>

What I have to do is classify each person in the second file by walking the decision tree in the first. For each person, I would start at the first split node of the decision tree and use that node's split attribute, together with its split value (for continuous splits) or its left/right value lists (for discrete splits), to decide whether to go to the left or the right child node, repeating until I reach a leaf node. I understand the concept, but I have no idea how to implement it. All I have right now is the code to get the root of each file.

      from lxml import etree  # assumed; "import xml.etree.ElementTree as etree" also works here

      tree = etree.parse(fileName)
      root = tree.getroot()
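For what it's worth, here is a minimal sketch of how the traversal described above might look. It is not a definitive implementation and rests on several assumptions that the files themselves don't confirm: the first child of a `splitNode` is taken to be the left branch and the second the right; for continuous splits, values less than or equal to `splitVal` go left; for discrete splits, any value not listed in `leftSplit` goes right. The `classify` function and the tiny inline tree and people data are hypothetical, made up only so the example runs on its own. It uses the standard library's `xml.etree.ElementTree`, which offers the same calls as lxml for this purpose.

```python
import xml.etree.ElementTree as etree  # lxml.etree exposes the same calls used here


def classify(node, person):
    """Walk the decision tree from `node` and return the reached leaf's incomeLevel."""
    while node.tag == "splitNode":
        # Assumption: first child is the left branch, second child is the right branch.
        left, right = list(node)[:2]
        value = person.get(node.get("splitAttr"))
        if node.get("splitType") == "continuous":
            # Assumption: values <= splitVal go left, larger values go right.
            go_left = float(value) <= float(node.get("splitVal"))
        else:
            # Assumption: anything not listed in leftSplit goes right.
            go_left = value in node.get("leftSplit").split(",")
        node = left if go_left else right
    return node.get("incomeLevel")


# Toy data for illustration only; this is NOT the real tree from the first file.
TREE_XML = """
<rootNode>
  <splitNode splitAttr="workclass" splitType="discrete"
             leftSplit="Local-gov,Federal-gov,State-gov"
             rightSplit="Self-emp-inc,Private,Self-emp-not-inc,?">
    <splitNode splitAttr="hours-per-week" splitType="continuous" splitVal="39.53">
      <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
      <leafNode incomeLevel="&gt;50K">Leaf</leafNode>
    </splitNode>
    <leafNode incomeLevel="&lt;=50K">Leaf</leafNode>
  </splitNode>
</rootNode>
"""

PEOPLE_XML = """
<People>
  <Person hours-per-week="13" workclass="Self-emp-not-inc"/>
  <Person hours-per-week="40" workclass="State-gov"/>
</People>
"""

tree_root = etree.fromstring(TREE_XML)
people_root = etree.fromstring(PEOPLE_XML)

first_split = tree_root[0]  # rootNode wraps the first splitNode
predictions = [classify(first_split, p) for p in people_root.iter("Person")]
print(predictions)  # ['<=50K', '>50K']
```

With the real files, `etree.parse(fileName).getroot()` would replace the `fromstring` calls. The sketch does no error handling, so a `Person` missing an attribute or a `splitNode` with fewer than two children would raise an exception.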

Any help that you guys could give would be greatly appreciated!!
