I'm confused about how to get XQuery to handle whitespace the way I want it to. Say I have the following XML:
<body>
to<lb/>
<choice norm="Miss">Mi<glyph ref="#sm-long-s">s</glyph>s</choice>
<name type="person"><forename>Margaret</forename> <surname>Hamilton</surname></name><lb />
<name type="place">S<hi rend="superscript">t</hi> James's</name>
</body>
If I use this code:
for $body in /body
return replace(string-join(
    for $t in $body//node()
    return
        typeswitch($t)
            case text() return
                if (
                    sum(
                        for $a in $t/ancestor::*
                        return
                            typeswitch($a)
                                case element(choice) return 1
                                default return 0
                    )=0
                ) then $t
                else null
            case element(lb) return ' '
            case element(choice) return $t/@norm
            default return null
),"\s+"," ")
I get the following output:
to MissMargaretHamilton St James's
rather than the expected
to Miss Margaret Hamilton St James's
Is there a way to fix that?
PS: There is no such thing as <forename> in the actual code, but I introduced it in this example to showcase both the line break and the space between > and < being ignored.