How do I generate more descriptive filenames for an XML file I'm splitting into pieces?



I have a pair of large XML files that I'm breaking apart with nawk so that I can work more easily with the pieces that are actually relevant to my project. The code I have does what I want, but it produces files without descriptive filenames, which makes it much more time-consuming to identify which of the child XML files contain the data I want to work with. Here is what I have now:


First XML file source


Code that's splitting this file apart:



nawk ' {print > "kingresult"(NR%1?i:i++)".txt"; }' i=1 PI.txt


Second XML file source


Code that's splitting this file apart:



nawk -v RS="</?Results>" -v FS="<Result>" '{ for(N=1; N<=NF; N++) if($N ~ /<[/]/) print FS $N > "stateresult00"++C".xml" }' 20140805_AllState.xml


The first XML file is being split on a line-by-line basis; the second is being split apart wherever nawk finds a new "Result" element. In both cases, however, the resulting filenames look like this:


result1.xml result2.xml result3.xml


... and so on.


It would save a lot of time if the filenames were more descriptive, and looked like this:


result1-John.xml result2-Jane.xml result3-Jake.xml


In the case of the first file, it would be acceptable if only the first word of the line were incorporated into the filename.
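As a rough sketch of that idea (not a tested fix for the real data): the first whitespace-separated field of each line can be copied into a variable, stripped of characters that are awkward in filenames, and concatenated into the output name. The sample file `PI_sample.txt` and its contents are made up for illustration, and `awk` is used here on the assumption that `nawk` accepts the same program.

```shell
# Hypothetical sample input standing in for PI.txt (one record per line).
printf '%s\n' 'John Smith 123' 'Jane Doe 456' 'Jake Long 789' > PI_sample.txt

# Write each line to its own file, named with the line number plus the first field.
awk '{
    name = $1
    gsub(/[^A-Za-z0-9_-]/, "", name)      # keep only filename-safe characters
    out = "kingresult" NR "-" name ".txt"
    print > out
    close(out)                            # release the handle before the next line
}' PI_sample.txt
```

With the sample above this should produce kingresult1-John.txt, kingresult2-Jane.txt and kingresult3-Jake.txt. If a line starts with markup, the stripped first field will still contain the tag name, so the pattern may need adjusting for the real file.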


In the case of the second XML file, it would be ideal if the first word in the <CandidateName> element could be added to the filename. How do I go about modifying my code to get nawk to create more descriptive filenames?
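For the second file, one possible direction is to locate the element text with `match()` and build the filename from its first word. This is only a sketch: it assumes each result contains an element written exactly as `<CandidateName>Some Name</CandidateName>` with no attributes, the sample file `AllState_sample.xml` and the `<Votes>` element are invented for illustration, and, like the original command, it relies on the awk in use treating a multi-character `RS` as a regular expression.

```shell
# Hypothetical sample standing in for 20140805_AllState.xml.
cat > AllState_sample.xml <<'EOF'
<Results>
<Result><CandidateName>John Smith</CandidateName><Votes>10</Votes></Result>
<Result><CandidateName>Jane Doe</CandidateName><Votes>20</Votes></Result>
</Results>
EOF

awk -v RS='</?Results>' -v FS='<Result>' '{
    for (N = 1; N <= NF; N++) {
        if ($N !~ /<\//) continue             # skip fragments with no closing tag
        name = "unknown"                      # fallback when no name is found
        if (match($N, /<CandidateName>[^<]*/)) {
            # drop the 15-character opening tag, keep the element text
            name = substr($N, RSTART + 15, RLENGTH - 15)
            sub(/^[ \t\r\n]+/, "", name)      # trim leading whitespace
            sub(/[ \t\r\n].*/, "", name)      # keep only the first word
            gsub(/[^A-Za-z0-9_-]/, "", name)  # keep only filename-safe characters
            if (name == "") name = "unknown"
        }
        out = "stateresult00" (++C) "-" name ".xml"
        print FS $N > out
        close(out)
    }
}' AllState_sample.xml
```

With the sample above this should write stateresult001-John.xml and stateresult002-Jane.xml. Calling `close(out)` after each write keeps the number of open files small, which may matter when the input contains many results.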

