Find InnerHtml value using XpathNavigator and HtmlAgilityPack

Developer https://www.devze.com 2023-03-30 05:19 Source: Internet
A portion of test.xml:

<tr class="a"> 
    <td align="left" nowrap="true">desc1</td> 
    <td align="left">desc2</td>  
    <td>desc3</td>  
    <td align="left">desc4</td> 
    <td align="left">desc5</td>
    <td>desc6</td> 
    <td>desc7</td> 
    <td>desc8</td>
    <td class="nr">desc9</td>
</tr>

// Create an XPathNavigator to get the value inside the last td, i.e. desc9
HtmlDocument document = new HtmlDocument();
document.Load(Server.MapPath("test.xml"));

XPathNavigator xPathNavigator = document.CreateNavigator();
object o = xPathNavigator.Evaluate("/table[1]/tbody[1]/tr[2]/td[9]");

The debugger shows that the value can be reached with the expression below, which is very cumbersome.

((HtmlAgilityPack.HtmlNodeNavigator)((new System.Linq.SystemCore_EnumerableDebugView(((MS.Internal.Xml.XPath.XPathSelectionIterator)(o)))).Items[0])).Value

What is the best way to get to desc9?


I haven't used the XPathNavigator but here is a similar solution with the SelectNodes/SelectSingleNode style and the HTML Agility Pack.

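string xPathSearch = "/table[1]/tbody[1]/tr[2]";
HtmlNode tableRow = doc.DocumentNode.SelectSingleNode(xPathSearch);
// ChildNodes also contains the whitespace text nodes between cells, so select
// the td elements explicitly; the collection is 0-based, so [8] is the ninth cell.
string description9 = tableRow.SelectNodes("td")[8].InnerText;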

Or:

string xPathSearch = "/table[1]/tbody[1]/tr[2]/td[9]";
HtmlNode tableColumn = doc.DocumentNode.SelectSingleNode(xPathSearch);
string description9 = tableColumn.InnerText;

FYI: the best documentation on the HTML Agility Pack seems to be the samples included with the source. I'm not sure why they aren't offered as a separate download in the documentation.
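
For the XPathNavigator route from the question, here is a minimal sketch of why the result looked so awkward in the debugger: Evaluate returns an XPathNodeIterator when the expression selects nodes, and a plain string when the expression is wrapped in string(). The sketch uses System.Xml.XPath on an inline XML string (a hypothetical stand-in for test.xml) rather than the HTML Agility Pack. The navigator returned by HtmlDocument.CreateNavigator() derives from XPathNavigator, so the same calls would be expected to work there, but that is an assumption and has not been checked here.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

class EvaluateDemo
{
    static void Main()
    {
        // Hypothetical stand-in for test.xml: a header row plus the row from the question.
        const string xml =
            "<table><tbody>" +
            "<tr><td>header</td></tr>" +
            "<tr class=\"a\">" +
            "<td>desc1</td><td>desc2</td><td>desc3</td><td>desc4</td><td>desc5</td>" +
            "<td>desc6</td><td>desc7</td><td>desc8</td><td class=\"nr\">desc9</td>" +
            "</tr></tbody></table>";

        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        const string path = "/table[1]/tbody[1]/tr[2]/td[9]";

        // A node-set expression yields an iterator; advance it to reach the first match.
        XPathNodeIterator matches = (XPathNodeIterator)nav.Evaluate(path);
        if (matches.MoveNext())
        {
            Console.WriteLine(matches.Current.Value); // prints "desc9"
        }

        // Wrapping the path in string() makes Evaluate return the text directly.
        string text = (string)nav.Evaluate("string(" + path + ")");
        Console.WriteLine(text); // prints "desc9"
    }
}
```

If only one node is needed, XPathNavigator.SelectSingleNode(path) avoids the iterator altogether and returns null when nothing matches.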


Something like this:

/table[1]/tbody[1]/tr[@class="a"]/td[last()]

Take a look at the XPath syntax reference.
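
To illustrate how the class predicate and last() combine, here is a small sketch that runs the expression above against an inline XML string. The sample markup and the class name LastCellDemo are made up for illustration, and the code uses System.Xml.XPath rather than the HTML Agility Pack so that it runs without extra packages.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

class LastCellDemo
{
    static void Main()
    {
        // Hypothetical sample: only the second row carries class="a".
        const string xml =
            "<table><tbody>" +
            "<tr class=\"h\"><td>header</td></tr>" +
            "<tr class=\"a\"><td>desc1</td><td>desc2</td><td class=\"nr\">desc9</td></tr>" +
            "</tbody></table>";

        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // tr[@class='a'] picks the row by attribute; td[last()] picks its final cell,
        // so the expression keeps working if the number of columns changes.
        XPathNavigator cell = nav.SelectSingleNode(
            "/table[1]/tbody[1]/tr[@class='a']/td[last()]");

        Console.WriteLine(cell != null ? cell.Value : "(no match)"); // prints "desc9"
    }
}
```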


I think you are going about this wrong.

I believe all you should need to do is something along the lines of:

document.DocumentNode.SelectSingleNode("/table[1]/tbody[1]/tr[2]/td[9]");

I can't find an online copy of the docs to link you to but you can check the docs found at http://htmlagilitypack.codeplex.com/releases/view/44954 for more details.

Also, if you are just reading XML, is there any reason why you are using the HTML Agility Pack, or is it just that your test file happens to be valid XML?
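
If the input really is well-formed XML, a sketch along these lines with the built-in XmlDocument class may be enough. The inline markup is a hypothetical stand-in for test.xml; with a real file, XmlDocument.Load(path) would replace LoadXml.

```csharp
using System;
using System.Xml;

class XmlDocumentDemo
{
    static void Main()
    {
        XmlDocument xml = new XmlDocument();

        // Hypothetical stand-in for test.xml: a header row plus a nine-cell row.
        xml.LoadXml(
            "<table><tbody>" +
            "<tr><td>header</td></tr>" +
            "<tr class=\"a\">" +
            "<td>desc1</td><td>desc2</td><td>desc3</td><td>desc4</td><td>desc5</td>" +
            "<td>desc6</td><td>desc7</td><td>desc8</td><td class=\"nr\">desc9</td>" +
            "</tr></tbody></table>");

        // SelectSingleNode returns null when the path matches nothing.
        XmlNode cell = xml.SelectSingleNode("/table[1]/tbody[1]/tr[2]/td[9]");
        Console.WriteLine(cell != null ? cell.InnerText : "(no match)"); // prints "desc9"
    }
}
```

This only works for well-formed input; real-world HTML with unclosed tags is where the HTML Agility Pack's more forgiving parser is the better fit.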
