Select elements with unique values

Developer https://www.devze.com 2023-03-16 22:02 Source: Internet
I'm trying to parse an OpenOffice spreadsheet to obtain rows with unique values in the first column.

That is, from the following XML fragment I would like to retrieve all <table:table-row> elements with unique <text:p> values in the first child <table:table-cell>, keeping only the first row for each value.

    <table:table table:name="foo">
        <table:table-row>
            <table:table-cell>
                <text:p>1</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>foo</text:p>
            </table:table-cell>
        </table:table-row>
        <table:table-row>
            <table:table-cell>
                <text:p>2</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>bar</text:p>
            </table:table-cell>
        </table:table-row>
        <table:table-row>
            <table:table-cell>
                <text:p>1</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>baz</text:p>
            </table:table-cell>
        </table:table-row>
    </table:table>

I'd like to get the following output as nodes:

        <table:table-row>
            <table:table-cell>
                <text:p>1</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>foo</text:p>
            </table:table-cell>
        </table:table-row>
        <table:table-row>
            <table:table-cell>
                <text:p>2</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>bar</text:p>
            </table:table-cell>
        </table:table-row>

How can I do this with XPath?


This XPath produces the desired output:

    /table:table/table:table-row[not(
        ./table:table-cell[1]/text:p/text()
        = preceding-sibling::table:table-row/table:table-cell[1]/text:p/text()
    )]
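
To make the predicate easier to follow, here is a rough Python sketch of the same filtering logic: keep a row only if no earlier row has the same text in its first cell. It walks the rows procedurally with the standard library's `xml.etree.ElementTree`, because that module's documented XPath subset does not cover the `preceding-sibling` axis. The namespace declarations are an assumption (the fragment in the question omits them; the URIs used are the usual OpenDocument ones), the helper names `first_column_text` and `rows_with_new_first_value` are made up for illustration, and the sketch assumes a single `<text:p>` per cell.

```python
import xml.etree.ElementTree as ET

# Assumed namespace bindings; the question's fragment does not declare them.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

SAMPLE = """
<table:table table:name="foo"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>foo</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>2</text:p></table:table-cell>
    <table:table-cell><text:p>bar</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>baz</text:p></table:table-cell>
  </table:table-row>
</table:table>
"""


def first_column_text(row):
    """Return the text of the first text:p in the row's first cell, or None."""
    cell = row.find("table:table-cell", NS)
    if cell is None:
        return None
    para = cell.find("text:p", NS)
    return para.text if para is not None else None


def rows_with_new_first_value(table):
    """Keep each row whose first-column text did not occur in an earlier row."""
    seen = set()
    kept = []
    for row in table.findall("table:table-row", NS):
        key = first_column_text(row)
        # A row without first-column text never equals anything, so it is
        # always kept, mirroring how the XPath predicate treats an empty node-set.
        if key is None or key not in seen:
            kept.append(row)
        if key is not None:
            seen.add(key)
    return kept


table = ET.fromstring(SAMPLE)
unique_rows = rows_with_new_first_value(table)
second_column = [
    row.findall("table:table-cell", NS)[1].find("text:p", NS).text
    for row in unique_rows
]
print(second_column)  # ['foo', 'bar']
```

The third row is dropped because its first cell repeats the value `1` from the first row, which is what the `not(... = preceding-sibling::...)` predicate expresses.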


Pure XPath should be:

 /table:table/table:*[not(
  .//text:p[1]
   = preceding-sibling::table:table-row//text:p[1]
 )]

This assumes that by expected output you mean a sequence of table:table-row nodes and not an XML document, as someone correctly noted in the comments.

 /table:table/table:*[not(
  ./table:*[1]//text:*[1]
   = preceding-sibling::table:*/table:*[1]/text:*[1]
 )]
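
These expressions all wrap the comparison in `not(... = ...)` rather than using `!=`. In XPath 1.0, comparing two node-sets is existential: `=` is true if any pair of nodes has equal string-values, and `!=` is true if any pair differs. The small Python model below is only an illustration of that rule, with node-sets reduced to lists of strings and the function names invented for the sketch.

```python
def nodeset_equals(left, right):
    """Model of XPath 1.0 '=' on node-sets: true if ANY pair of values matches."""
    return any(a == b for a in left for b in right)


def nodeset_not_equals(left, right):
    """Model of XPath 1.0 '!=' on node-sets: true if ANY pair of values differs."""
    return any(a != b for a in left for b in right)


# First-column string-values of the three sample rows, in document order.
first_cells = ["1", "2", "1"]

# [value] stands for the current row's node-set and first_cells[:index]
# for the node-set gathered from its preceding-sibling rows.
kept = [
    index
    for index, value in enumerate(first_cells)
    if not nodeset_equals([value], first_cells[:index])
]

# The tempting but wrong variant: '!=' instead of not(... = ...).
kept_with_ne = [
    index
    for index, value in enumerate(first_cells)
    if nodeset_not_equals([value], first_cells[:index])
]

print(kept)          # [0, 1]
print(kept_with_ne)  # [1, 2]
```

With `!=`, the first row is lost because it has no preceding rows to differ from, and the duplicate third row is kept because it differs from the second row.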