I'm trying to do what I thought would be a trivial transformation using XSL. The problem is very straightforward. Here's my input:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic_stl_xnt_tq">
<title>The Torments of Hell</title>
<body>
<p>Life is a <xref href="stuff">dungeon</xref>
and an <xref href="stuff">abyss</xref>.</p>
</body>
</topic>
I am trying to get output like this:
Life is a dungeon and an abyss.
That is, all on one line, and with no extra spaces. I can't control the way my editor formats paragraphs; the line breaks in the XML are totally arbitrary. I've tried the solutions here and here, but the problem with both of them is that they only handle spaces, not whitespace in general, such as linefeeds. My original, naive attempt looked like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:template match="topic">
<xsl:value-of select="title"/>
<xsl:apply-templates select="body/p"/>
</xsl:template>
<xsl:template match="p">
<xsl:text>

</xsl:text>
<xsl:apply-templates select="node()"/>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
<xsl:template match="xref">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
Note the use of normalize-space. This takes care of the linefeeds, but it is too aggressive around the embedded <xref> tags: it also strips the spaces next to them, resulting in this output:
Life is adungeonand anabyss.
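To check my understanding of why this happens, here is a rough Python sketch (not XSL) that mimics what I assume the stylesheet does: it applies a normalize-space-like function to each text node of the paragraph separately. The helper names `normalize_space` and `per_node` are my own, and the sketch only approximates what an XSLT processor does.

```python
import re
import xml.etree.ElementTree as ET

# The <p> element from the input above, including the arbitrary line break.
P_XML = ('<p>Life is a <xref href="stuff">dungeon</xref>\n'
         'and an <xref href="stuff">abyss</xref>.</p>')

def normalize_space(s):
    """Approximate XPath normalize-space(): collapse runs of
    space/tab/CR/LF to one space, then trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

def per_node(p):
    """Normalize each text node on its own, like the text() template."""
    parts = [normalize_space(p.text or "")]
    for child in p:
        # Like <xsl:value-of select="."/> on xref: the element's string value.
        parts.append("".join(child.itertext()))
        # The text node that follows the child element.
        parts.append(normalize_space(child.tail or ""))
    return "".join(parts)

p = ET.fromstring(P_XML)
print(per_node(p))                              # Life is adungeonand anabyss.
print(normalize_space("".join(p.itertext())))   # Life is a dungeon and an abyss.
```

Each text node is trimmed at both ends, so the space that separated it from the neighboring <xref> is lost. Normalizing the paragraph's whole string value in one go gives the text I'm after in this sketch, but that only shows the goal, not how to get there in a stylesheet.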
I tried simply hacking it by adding an extra space before and after each <xref>, but plenty of them shouldn't have a trailing space, such as those just before a period or other punctuation mark.
I am very new to XSL--I'm hoping that I'm simply missing something obvious. Can anyone help?