I have inherited a codebase that uses Neo4j v1.9.4 Cypher queries to extract data from XML files, and I have a question about merging specific data fields from within the XML data structure.
I am quite new to this area, so I apologise if this question has been asked before. I have been unable to find anything similar on Stack Overflow.
I have the following XML data structure:
<description type="1">
<narrative>
General activity description text. Long description of the activity with no particular structure.
</narrative>
<narrative xml:lang="fr">
Activité générale du texte de description. Longue description de l'activité sans structure particulière.
</narrative>
</description>
<description type="2">
<narrative>
Objectives for the activity, for example from a logical framework.
</narrative>
<narrative xml:lang="fr">
Objectifs de l'activité, par exemple à partir d'un cadre logique.
</narrative>
</description>
<description type="3">
<narrative>
Statement of groups targeted to benefit from the activity.
</narrative>
<narrative xml:lang="fr">
Déclaration de groupes ciblés pour bénéficier de l'activité.
</narrative>
</description>
I would like to create a Cypher query that returns the contents of the three English-language narrative tags held within the description tags as one combined value. The returned data should look like this:
General activity description text. Long description of the activity with no particular structure. Objectives for the activity, for example from a logical framework. Statement of groups targeted to benefit from the activity.
A further complication is that the Cypher query will also have to cope with the situation where there are fewer than three description tags. The edge case (an edge case purely in terms of data structure; in practice it may be fairly common) would be as follows:
<description>
<narrative>
General activity description text. Long description of the activity with no particular structure.
</narrative>
<narrative xml:lang="fr">
Activité générale du texte de description. Longue description de l'activité sans structure particulière.
</narrative>
</description>
Note: the type="1" attribute is not mandatory when there is only one description tag.
From this XML data structure, the output from the Cypher query would need to be: General activity description text. Long description of the activity with no particular structure.
One final point to note is that the English-language narrative tags may or may not carry the xml:lang="en" attribute.
For now, it may be simplest to assume that, for each description tag, I want the contents of the first narrative tag that either has no language attribute at all or has xml:lang="en".
We are stuck with version 1.9.4 of Neo4j at the moment, so upgrading the database to get functionality from later versions of the product is not really an option.
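To make the selection rule concrete, here is a minimal Python sketch of the result I am after. It is not Cypher and not part of the inherited codebase; it only illustrates the logic. The function name `combined_english_narrative` and the wrapping `<activity>` root element are assumptions made for this illustration, as is collapsing runs of whitespace inside each narrative to single spaces.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the standard XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def combined_english_narrative(xml_fragment: str) -> str:
    """Join the first English narrative of each <description> into one string.

    A narrative counts as English when it has no xml:lang attribute
    or when xml:lang is exactly "en".
    """
    # The fragment has several top-level <description> elements, so wrap it
    # in a hypothetical <activity> root to make it parseable.
    root = ET.fromstring("<activity>" + xml_fragment + "</activity>")

    parts = []
    for description in root.findall("description"):
        for narrative in description.findall("narrative"):
            lang = narrative.get(XML_LANG)
            if lang is None or lang == "en":
                # Collapse newlines and indentation into single spaces.
                text = " ".join((narrative.text or "").split())
                if text:
                    parts.append(text)
                break  # only the first English narrative per description
    return " ".join(parts)
```

The same function covers both the three-description case and the single-description case, because it simply iterates over however many description tags are present and ignores the type attribute.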
Thank you in advance for any assistance that you are able to provide.