Normalise std::cout output when the tags aren't named correctly



Problem



  • I have code to compare nodes in sets of XML files in folders

  • Each folder contains an a.xml and a b.xml

  • The node names will match for files in the same folder

  • The node names will not match for files in different folders

  • The std::cout output must be consistent no matter what the input nodes are named.

  • The standard (default) node names are id and description


example1 folder - a.xml and b.xml comparison output



<entry>
<id><![CDATA[9]]></id>
<description><![CDATA[Dolce 27 Speed]]></description>
</entry>


example2 folder - a.xml and b.xml comparison output



<entry>
<id><![CDATA[4]]></id>
<content><![CDATA[Specialized Dolce Sport]]></content>
</entry>


The problem here is that a.xml and b.xml in the example2 folder use the node name content instead of the standard name description.


Possible solution


Each time I run the software, I define mappings for any nodes that differ from the default, like this: description = content;
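
A minimal sketch of how mapping definitions of this form might be parsed into a lookup table. The names TagMap, parse_mappings and normalise_tag are hypothetical, and the choice to silently skip malformed entries is an assumption, not a requirement from the problem above.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical alias: folder-specific tag name -> standard tag name.
using TagMap = std::unordered_map<std::string, std::string>;

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse "standard = actual; standard2 = actual2;" into actual -> standard.
// Assumption: entries without '=' or with an empty side are skipped.
TagMap parse_mappings(const std::string& spec) {
    TagMap renames;
    std::string::size_type pos = 0;
    while (pos < spec.size()) {
        std::string::size_type semi = spec.find(';', pos);
        if (semi == std::string::npos) semi = spec.size();
        const std::string entry = spec.substr(pos, semi - pos);
        pos = semi + 1;

        const std::string::size_type eq = entry.find('=');
        if (eq == std::string::npos) continue;
        const std::string standard = trim(entry.substr(0, eq));
        const std::string actual = trim(entry.substr(eq + 1));
        if (standard.empty() || actual.empty()) continue;
        renames[actual] = standard;
    }
    return renames;
}

// Return the standard name for a tag, or the tag itself when no rename applies.
std::string normalise_tag(const TagMap& renames, const std::string& tag) {
    const TagMap::const_iterator it = renames.find(tag);
    return it == renames.end() ? tag : it->second;
}
```

With the mapping from this example, normalise_tag(parse_mappings("description = content;"), "content") gives "description", while "id" is returned unchanged. The table is keyed by the name found in the file, so each lookup while printing is a single hash lookup.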


Question


How can I normalise the std::cout output for nodes that are named differently from the default, so that in this example the "content" tag is printed as "description"?


Existing code



#include "pugi/pugixml.hpp"

#include <iostream>
#include <string>
#include <map>

int main() {
    pugi::xml_document doca, docb;
    // Nodes keyed by their id value.
    std::map<std::string, pugi::xml_node> mapa, mapb;

    if (!doca.load_file("a.xml") || !docb.load_file("b.xml")) {
        std::cout << "Can't find input files" << std::endl;
        return 1;
    }

    // Index every job in a.xml by id.
    for (auto& node: doca.child("jobsite_vacancies").children("job")) {
        const char* id = node.child_value("id");
        mapa[id] = node;
    }

    // Ids present in both files are dropped from mapa;
    // ids found only in b.xml are collected in mapb.
    for (auto& node: docb.child("jobsite_vacancies").children("job")) {
        const char* idcs = node.child_value("id");
        if (!mapa.erase(idcs)) {
            mapb[idcs] = node;
        }
    }

    // Whatever is left in mapa exists only in a.xml.
    for (auto& ea: mapa) {
        std::cout << "Removed:" << std::endl;
        ea.second.print(std::cout);
    }

    for (auto& eb: mapb) {
        std::cout << "Added:" << std::endl;
        eb.second.print(std::cout);
    }

    return 0;
}


I'm new to C++, so any suggestions on how to implement this would be appreciated. I have to run this on tens of thousands of fields, so performance is key.

