XSLT on docx for merging adjacent elements

Developer https://www.devze.com 2023-03-31 06:50 Source: Internet
I have a set of interview transcripts in MS Word docx format, which I want to convert to my own custom xml schema:

A paragraph in my word doc looks like this:

Jon: This is my interview. Now I am shouting! Now I am speaking normally again.

and in my custom schema should look like this:

<para speaker="jon">
    <content>This is my interview.</content>
    <content emphasis="true">Now I am shouting!</content>
    <content>Now I am speaking normally again.</content>
</para>

In the docx xml, I want adjacent w:r elements to be merged into a single element in all other cases.

Any help would be much appreciated.

Thanks

Swami


Your example doesn't really match your question, but to answer the question of how to merge adjacent elements with XSLT, here is a version that uses your w:r example and assumes the "w" namespace prefix is already declared in scope:

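<!-- The first w:r under each parent absorbs the content of all later w:r siblings -->
<xsl:template match="w:r[1]">
  <w:r>
    <xsl:copy-of select="@*|node()" />
    <!-- assuming you don't care about attributes on the later w:r elements: they are dropped -->
    <xsl:copy-of select="following-sibling::w:r/node()" />
  </w:r>
</xsl:template>

<!-- Every other w:r is suppressed, because its content was already copied above -->
<xsl:template match="w:r" />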

You can also do this with the XSLT 2.0 grouping operations (xsl:for-each-group with the group-adjacent attribute), which you might want to look into if your case is more complex than this simple example.
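As a rough illustration of the XSLT 2.0 grouping approach, the sketch below groups consecutive w:r runs by whether they carry a w:b (bold) element and emits one content element per group. Several things here are assumptions rather than anything stated in the question: that emphasis in the Word file is expressed as bold, that a bare w:b always means "on" (a w:b with w:val="false" or "0" would need an extra check), and that the output is wrapped in a root element named transcript, which is a made-up name. Extracting the speaker attribute from the "Jon:" prefix is left out.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical root element; replace with whatever your schema requires. -->
  <xsl:template match="/">
    <transcript>
      <xsl:apply-templates select="//w:p"/>
    </transcript>
  </xsl:template>

  <!-- One para per Word paragraph. -->
  <xsl:template match="w:p">
    <para>
      <!-- Consecutive runs with the same bold state land in the same group. -->
      <xsl:for-each-group select="w:r" group-adjacent="exists(w:rPr/w:b)">
        <content>
          <!-- The grouping key is true for a group of bold runs. -->
          <xsl:if test="current-grouping-key()">
            <xsl:attribute name="emphasis">true</xsl:attribute>
          </xsl:if>
          <!-- Concatenate the text of every run in the group, with no separator. -->
          <xsl:value-of select="current-group()/w:t" separator=""/>
        </content>
      </xsl:for-each-group>
    </para>
  </xsl:template>

</xsl:stylesheet>
```

This needs an XSLT 2.0 processor such as Saxon; an XSLT 1.0 processor will not recognise xsl:for-each-group. Unlike the following-sibling version above, group-adjacent only merges runs that are actually next to each other, so a bold run in the middle of a paragraph splits the surrounding plain text into two separate content elements.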


Full code here. Thanks to MarkLogic Blog!

http://www.xqzone.com/blog/smallchanges/2007-12-18
