extract an XML schema (or equivalent) from a large set of xml files



I have 46000 xml files that all share a common structure but there are variations and the number of files makes it impossible to figure out a common xml schema for all of them. Is there a tool that may help me to extract a schema from all these files or at least something that may give me a close enough idea of what is mandatory and what is optional?


Preferably, of course, a schema according to some standard or a DTD.


And, as I work entirely in Linux, a Linux tool or a program that works in Linux is OK. I am quite fluent in C, Java, Javascript, Groovy, Python (2.7 and 3) and a number of other languages.


No comments:

Post a Comment