
XPath/XSLT Remove Empty Tags

Developer https://www.devze.com 2023-04-13 04:14 Source: Internet
I would like to remove tags which contain only whitespace/newline/tab chars, as below:

<p>    </p>

How would you do this using xpath functions and xslt templates?


This transformation (overriding the identity rule):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="*[not(*) and not(text()[normalize-space()])]"/>
</xsl:stylesheet>

when applied to the following XML document:

<t>
 <a>
  <b>
    <c/>
  </b>
 </a>
 <p></p>
 <p>  </p>
 <p>Text</p>
</t>

correctly produces the wanted result:

<t>
   <a>
      <b/>
   </a>
   <p>Text</p>
</t>

Remember: Using and overriding the identity rule/template is the most fundamental and powerful XSLT design pattern. It is the right choice for a variety of problems where most of the nodes are to be copied unchanged and only some specific nodes need to be altered, deleted, or renamed.
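
As a small illustration of the "renamed" case, the sketch below keeps the identity rule and adds a single override that renames every p element. The target name para is a hypothetical choice made for this example; it does not come from the question.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Identity rule: copies every node and attribute unchanged -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <!-- Override: emit <para> (hypothetical name) in place of <p>,
      still processing the attributes and children of the original p -->
 <xsl:template match="p">
     <para>
       <xsl:apply-templates select="node()|@*"/>
     </para>
 </xsl:template>
</xsl:stylesheet>
```

The override wins over the identity rule for p elements because a pattern that names an element has a higher default priority than node().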

Note: @Abel recommends in a comment that some parts of this solution be explained further:

For the uninitiated or curious: not(*) means the element has no child element; not(text()[normalize-space()]) means it has no text node child that contains non-whitespace characters.
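
One consequence of these two tests: match patterns are evaluated against the source tree, so b in the sample document, which has the child c there, is kept and comes out as <b/>. If elements whose whole subtree holds no text should be removed as well, one possible variant tests the element's string value instead. This is a sketch under that assumption and is not part of the original answer.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule, same as in the answer above -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <!-- Variant (assumption, not from the original answer):
      normalize-space() with no argument uses the string value of the
      context element, i.e. all of its descendant text concatenated.
      An empty result means there is no non-whitespace text anywhere
      in the subtree, so the whole element is dropped. -->
 <xsl:template match="*[not(normalize-space())]"/>
</xsl:stylesheet>
```

Applied to the sample document, this should remove a together with b and c, as well as both empty p elements. The trade-off is that it also removes elements that are meaningful without text, such as an element carrying only attributes, so the match pattern may need to be narrowed for real documents.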
