Wednesday, 10 December 2014

XML storage stack and database management options



What is a good workflow and technology stack for handling an XML feed where you will carry out the three steps listed below? For me it is particularly the middle stage, point b), where I am swamped with options and lost in the forest.


I am particularly interested in what stack has worked for you and why.



a) Filter the data to create a CSV of a subset for statistical analysis.


b) Store the data.


c) Present a mix of the XML data and returned statistical data.



Considerations per step:



a) At this stage I will likely use XQuery regardless of the other technologies chosen. This isn't particularly difficult.
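To make step a) concrete, here is a minimal sketch of the same filtering done in Python with only the standard library, rather than in XQuery. The feed layout (`feed`, `item`, `category`, `value`) is invented for illustration and is not the real feed; ElementTree only supports a limited XPath subset, so a real XQuery processor would be more expressive.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical feed: element and attribute names are assumptions for illustration.
FEED = """\
<feed>
  <item id="1" category="a"><value>10.5</value></item>
  <item id="2" category="b"><value>3.0</value></item>
  <item id="3" category="a"><value>7.25</value></item>
</feed>
"""


def filter_to_csv(xml_text, category):
    """Return CSV text holding the id and value of every item in one category."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "value"])
    # Attribute predicates are part of the XPath subset ElementTree understands.
    for item in root.findall(f"./item[@category='{category}']"):
        writer.writerow([item.get("id"), item.findtext("value")])
    return out.getvalue()


csv_text = filter_to_csv(FEED, "a")
print(csv_text)
```

The resulting text can be written to a file and loaded by whatever statistics tool is used downstream.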


b) Store the XML in a native XML database, or convert it to a NoSQL store, or convert it to a relational SQL database.


If using an XML database such as BaseX or eXist-db, the advantage I perceive is that the data maps quickly and directly. BaseX also has a nice GUI which allows you to view and examine data easily.


If using MongoDB for NoSQL, or Postgres for SQL, what are the advantages, and what is the best way to map the XML to them?
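As a rough illustration of what "mapping" means in each case, the sketch below turns the same XML elements into document-style dicts (the shape a store like MongoDB would hold) and into flat typed rows for a SQL table. The feed layout is invented, and `sqlite3` from the standard library stands in for Postgres purely so the example runs without a server; it is not a recommendation of either mapping.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical feed: element and attribute names are assumptions for illustration.
FEED = """\
<feed>
  <item id="1" category="a"><value>10.5</value></item>
  <item id="2" category="b"><value>3.0</value></item>
</feed>
"""


def to_document(elem):
    """Document mapping: one dict per element, values kept as the strings found in the XML."""
    doc = dict(elem.attrib)
    for child in elem:
        doc[child.tag] = child.text
    return doc


def to_row(elem):
    """Relational mapping: a flat, typed tuple that must match a fixed table schema."""
    return (int(elem.get("id")), elem.get("category"), float(elem.findtext("value")))


items = ET.fromstring(FEED).findall("item")

# Document side: no schema is declared up front.
documents = [to_document(e) for e in items]
print(json.dumps(documents[0]))

# Relational side: the schema is declared first, then rows are inserted.
conn = sqlite3.connect(":memory:")  # stand-in for a Postgres connection
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, category TEXT, value REAL)")
conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [to_row(e) for e in items])
total = conn.execute("SELECT SUM(value) FROM item").fetchone()[0]
print(total)
```

The document mapping is almost free to write but leaves typing and validation to later; the relational mapping forces those decisions early in exchange for typed columns and SQL aggregation.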


Where do technologies such as JAXB and PyXB play a part in this, and are they useful? It seems their role is to create a class-based structure from an XML schema (XSD), but they are not an ORM. So I would have created classes and a structure, and would then need to repeat this in a database or in an ORM such as SQLAlchemy.
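The duplication described above can be sketched with a hand-written class. This is not how JAXB or PyXB work internally; the `Item` dataclass below merely stands in for a generated binding class, and the `create_table_sql` helper is a hypothetical way to derive the table from the class so the structure is declared only once. All names are assumptions for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import astuple, dataclass, fields


@dataclass
class Item:
    """Stand-in for a binding class that a tool would generate from a schema."""
    id: int
    category: str
    value: float

    @classmethod
    def from_element(cls, elem):
        # The binding step: XML element in, typed object out.
        return cls(int(elem.get("id")), elem.get("category"), float(elem.findtext("value")))


def create_table_sql(cls):
    """Derive a CREATE TABLE statement from the dataclass fields (hypothetical helper)."""
    sql_types = {"int": "INTEGER", "str": "TEXT", "float": "REAL"}
    cols = []
    for f in fields(cls):
        # f.type is a class normally, or a string when annotations are postponed.
        type_name = f.type if isinstance(f.type, str) else f.type.__name__
        cols.append(f"{f.name} {sql_types[type_name]}")
    return f"CREATE TABLE {cls.__name__.lower()} ({', '.join(cols)})"


elem = ET.fromstring('<item id="1" category="a"><value>10.5</value></item>')
item = Item.from_element(elem)

conn = sqlite3.connect(":memory:")  # stand-in for a real database
conn.execute(create_table_sql(Item))
conn.execute("INSERT INTO item VALUES (?, ?, ?)", astuple(item))
row = conn.execute("SELECT id, category, value FROM item").fetchone()
print(row)
```

An ORM such as SQLAlchemy plays the role of `create_table_sql` and the insert in a far more complete way; the point here is only that the binding class and the persistence layer are separate concerns.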



It is this middle section of the stack that has me reviewing a wide array of options. What is a good, safe and reliable course to steer through this?



c) This stage also seems pretty self-explanatory: use XSLT to create a view of the data returned from the XML, and present graphical data using ggplot, Plotly or Google Charts in whatever web format is desired.
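For a feel of what step c) produces, the sketch below builds a small HTML table that mixes values taken from the XML with a computed statistic. It uses Python's standard library in place of XSLT, which the standard library does not provide, and it leaves the charting out entirely. The feed layout is again an assumption for illustration.

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical feed: element and attribute names are assumptions for illustration.
FEED = """\
<feed>
  <item id="1" category="a"><value>10.5</value></item>
  <item id="2" category="b"><value>3.0</value></item>
  <item id="3" category="a"><value>7.5</value></item>
</feed>
"""


def render_summary(xml_text):
    """Return an HTML table with the item count and mean value per category."""
    root = ET.fromstring(xml_text)
    by_category = {}
    for item in root.findall("item"):
        by_category.setdefault(item.get("category"), []).append(float(item.findtext("value")))

    table = ET.Element("table")
    header = ET.SubElement(table, "tr")
    for title in ("category", "count", "mean"):
        ET.SubElement(header, "th").text = title
    for category in sorted(by_category):
        values = by_category[category]
        row = ET.SubElement(table, "tr")
        for cell in (category, str(len(values)), f"{statistics.mean(values):.2f}"):
            ET.SubElement(row, "td").text = cell
    return ET.tostring(table, encoding="unicode")


html = render_summary(FEED)
print(html)
```

Building the markup as an element tree rather than by string concatenation keeps the output well-formed and escapes any special characters in the data.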




