Preparing XML data for the Apriori algorithm



Generally, the Apriori algorithm takes its input as a binary matrix, with one row per transaction and one column per item, as follows:



TID A B C D E
T1 1 1 1 0 0
T2 1 1 1 1 1
T3 1 0 1 1 0
T4 1 0 1 1 1
T5 1 1 1 1 0
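
To make the layout concrete, here is a minimal Python sketch of how I read the matrix above: each row is a transaction, and a 1 means the item in that column is present. The helper names `matrix_to_transactions` and `support` are my own and do not come from any particular Apriori library.

```python
# The matrix from above as plain text (1 = item present in the transaction).
MATRIX = """\
TID A B C D E
T1 1 1 1 0 0
T2 1 1 1 1 1
T3 1 0 1 1 0
T4 1 0 1 1 1
T5 1 1 1 1 0
"""


def matrix_to_transactions(text):
    """Turn the 0/1 matrix into a dict mapping TID -> set of item names."""
    lines = text.strip().splitlines()
    items = lines[0].split()[1:]  # column headers after "TID"
    transactions = {}
    for line in lines[1:]:
        tid, *flags = line.split()
        transactions[tid] = {
            item for item, flag in zip(items, flags) if flag == "1"
        }
    return transactions


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


transactions = matrix_to_transactions(MATRIX)
for tid, items in transactions.items():
    print(tid, sorted(items))
print("support of {A, B}:", support({"A", "B"}, transactions))
```

Support counting of this kind is the quantity Apriori thresholds when it looks for frequent itemsets, which is why the input has to be reducible to "which items occur in which transaction".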


My input, however, is XML data of the following general form:



<article key="tr/gte/TR-0263-08-94-165">
<author>Frank Manola</author>
<title>An Evaluation of Object-Oriented DBMS Developments: 1994 Edition.</title>
<journal>GTE Laboratories Incorporated</journal>
<volume>TR-0263-08-94-165</volume>
<month>August</month>
<year>1994</year>
</article>


How can I convert such data into a form that the algorithm accepts? Any suggestions are welcome.
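
One direction that might work, as a rough sketch only: treat each <article> as a transaction and each tag=value pair (for example author=Frank Manola) as an item, then build the 0/1 matrix from that. The code below uses Python's standard xml.etree.ElementTree module. Several things in it are my assumptions rather than facts about the data: the <records> wrapper element, the choice of author, journal and year as the fields to mine, the function name `articles_to_matrix`, and the second article, which is a made-up placeholder added only so the matrix has more than one row.

```python
import xml.etree.ElementTree as ET

# Assumption: the articles sit under a single root element. The name
# <records> is hypothetical, and the second <article> is placeholder data.
SAMPLE = """\
<records>
<article key="tr/gte/TR-0263-08-94-165">
<author>Frank Manola</author>
<title>An Evaluation of Object-Oriented DBMS Developments: 1994 Edition.</title>
<journal>GTE Laboratories Incorporated</journal>
<volume>TR-0263-08-94-165</volume>
<month>August</month>
<year>1994</year>
</article>
<article key="placeholder/1">
<author>Placeholder Author</author>
<journal>GTE Laboratories Incorporated</journal>
<year>1994</year>
</article>
</records>
"""


def articles_to_matrix(xml_text, fields=("author", "journal", "year")):
    """Return (columns, rows): item names and one 0/1 row per article."""
    root = ET.fromstring(xml_text)
    transactions = {}
    for article in root.iter("article"):
        tid = article.get("key")  # use the key attribute as transaction ID
        items = set()
        for child in article:
            # Each selected tag=value pair becomes one item.
            if child.tag in fields and child.text:
                items.add(f"{child.tag}={child.text.strip()}")
        transactions[tid] = items
    # Every distinct item becomes one column of the matrix.
    columns = sorted(set().union(*transactions.values()))
    rows = [
        (tid, [1 if col in items else 0 for col in columns])
        for tid, items in transactions.items()
    ]
    return columns, rows


columns, rows = articles_to_matrix(SAMPLE)
print("TID", columns)
for tid, flags in rows:
    print(tid, flags)
```

Whether fields such as title or volume should become items is a modelling choice: values that are unique to a single article can never be frequent, so leaving them out keeps the matrix narrow.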


Thanks

