
Display double encoded string as HTML from XML source using XSL

Developer https://www.devze.com 2023-04-11 00:56 Source: Internet

I have an XML source whose content I need to display in a web page as HTML using XSL. One of the XML nodes contains a doubly HTML-encoded value. This is the one I need to output as HTML.

So the original HTML input was <p><strong>hello world</strong></p>, but it is stored as text that has been HTML-encoded twice:

  • original version: <p><strong>hello world</strong></p>
  • first HTML encoding: &lt;p&gt;&lt;strong&gt;hello world&lt;/strong&gt;&lt;/p&gt;
  • second HTML encoding: &amp;lt;p&amp;gt;&amp;lt;strong&amp;gt;hello world&amp;lt;/strong&amp;gt;&amp;lt;/p&amp;gt;

I receive only this second HTML encoding from my XML source:

<CONTENT>
   <RECORD>
      <OVERVIEW>&amp;lt;p&amp;gt;&amp;lt;strong&amp;gt;hello world&amp;lt;/strong&amp;gt;&amp;lt;/p&amp;gt;</OVERVIEW>
   </RECORD>
</CONTENT>

Outputting to html in the XSL using xsl:output gets things started, and disable-output-escaping in my xsl:value-of tag gets me past one layer of HTML encoding.

But the following XSL:

<xsl:for-each select = "//CONTENT/RECORD">
   <xsl:value-of disable-output-escaping="yes" select = "OVERVIEW" />
</xsl:for-each>

Returns only:

&lt;p&gt;&lt;strong&gt;hello world&lt;/strong&gt;&lt;/p&gt;

It doesn't get me all the way back to the original input, <p><strong>hello world</strong></p>.

So I've been looking for a way to apply disable-output-escaping="yes" twice, a "double" unescape.

Any ideas how I can do this just in XSL?


My understanding is that you cannot apply disable-output-escaping twice in XSLT. It is a serialization option and does not affect the transformation itself: you can mark an output text node so that it is not escaped when serialized, and that's it. To unescape twice, you need to pre-process the input document or use an extension function.


Remember that the operation that "unescapes" content is properly called parsing, and the operation that "escapes" it is called serialization. So to perform two levels of unescaping, call parse(parse(X)). Extension functions to do parse() and serialize() operations are available in some XSLT processors such as Saxon, and in others you can write your own.
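
As a rough illustration of parse(parse(X)), here is a minimal sketch that assumes an XSLT 3.0 processor is available (recent Saxon releases are one example). It uses the standard parse-xml-fragment() function rather than a vendor extension, which is a substitution on my part and will not work in XSLT 1.0 or 2.0. It also assumes the stored HTML is well-formed XML: an unclosed <br> or an entity such as &nbsp; would make the parse fail.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: needs an XSLT 3.0 processor, because parse-xml-fragment()
     is an XPath 3.0 function that XSLT 1.0 and 2.0 do not provide. -->
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <xsl:for-each select="//CONTENT/RECORD">
      <!-- The XML parser has already removed one layer of escaping, so the
           string value of OVERVIEW is the singly encoded text.
           Inner call: that text is parsed into a plain text node holding
           the markup as characters.
           Outer call: those characters are parsed into real p and strong
           element nodes, which copy-of writes to the result tree. -->
      <xsl:copy-of select="parse-xml-fragment(parse-xml-fragment(OVERVIEW))"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the result consists of real element nodes rather than unescaped text, disable-output-escaping is not needed here, and the output does not depend on how the serializer treats that attribute.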
