Using HTML Agility Pack to get text next to image?

Developer, https://www.devze.com, 2023-03-10 17:15, source: Internet
I have this bit of HTML that I need to parse through:

<p class="feature_list">

<img src="candy.gif" alt="candy" title="candy"/>&nbsp;
                        x 3&nbsp;&nbsp;
<img src="lollies.gif" alt="lollies" title="lollies"/>&nbsp;
                        1&nbsp;&nbsp;
<img src="system.gif" alt="system" title="system"/>&nbsp;

                        x 1&nbsp;&nbsp;
<img src="phone.gif" alt="phone" title="phone"/>&nbsp;
                        x 1&nbsp;&nbsp;
</p>

As you can see, there is an image and then some text like "x 3" next to it.

What I want to do is go through each image and record the text next to it. However, the text is outside the 'img' tag.

Is there any way of doing this using the HTML Agility Pack?


The following code:

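    // Requires: using System; using HtmlAgilityPack;
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    // Load reads from a file path or stream; if yourHtml holds the markup
    // itself as a string, call doc.LoadHtml(yourHtml) instead.
    doc.Load(yourHtml);

    // The text next to each image is the node that immediately follows the <img> element.
    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//img"))
    {
        // DeEntitize turns &nbsp; into a real non-breaking space, which Trim() then removes.
        Console.WriteLine(HtmlEntity.DeEntitize(node.NextSibling.InnerText).Trim());
    }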

Will output:

x 3
1
x 1
x 1

Note the HtmlEntity utility class, which eases the handling of HTML entities (like &nbsp;).
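
If you also want to know which image each piece of text belongs to, or the markup might contain an image with nothing after it, a slightly more defensive variant may help. The following is only a sketch under a few assumptions: the class name `ImageTextReader` and the method `ReadImageTexts` are made up for illustration, the HTML is passed in as a string (hence `LoadHtml`), and the image's `alt` attribute is used as the label.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HTML Agility Pack.
public static class ImageTextReader
{
    // Returns one (alt, text) pair per <img>: the image's alt attribute
    // and the trimmed text that directly follows it (empty if there is none).
    public static List<KeyValuePair<string, string>> ReadImageTexts(string html)
    {
        var results = new List<KeyValuePair<string, string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parses markup held in a string

        // SelectNodes returns null rather than an empty collection when nothing matches.
        HtmlNodeCollection images = doc.DocumentNode.SelectNodes("//img");
        if (images == null)
        {
            return results;
        }

        foreach (HtmlNode img in images)
        {
            HtmlNode next = img.NextSibling;

            // Only accept a text node; an image followed directly by another
            // element (or by nothing at all) gets an empty string.
            string text = (next != null && next.NodeType == HtmlNodeType.Text)
                ? HtmlEntity.DeEntitize(next.InnerText).Trim()
                : string.Empty;

            string alt = img.GetAttributeValue("alt", string.Empty);
            results.Add(new KeyValuePair<string, string>(alt, text));
        }

        return results;
    }

    public static void Main()
    {
        // Shortened version of the markup from the question.
        string html =
            "<p class=\"feature_list\">" +
            "<img src=\"candy.gif\" alt=\"candy\" title=\"candy\"/>&nbsp; x 3&nbsp;&nbsp;" +
            "<img src=\"lollies.gif\" alt=\"lollies\" title=\"lollies\"/>&nbsp; 1&nbsp;&nbsp;" +
            "</p>";

        foreach (KeyValuePair<string, string> pair in ReadImageTexts(html))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

The null checks are the main difference from the shorter loop: `SelectNodes` gives back null when no `img` is found, and `NextSibling` is null when the image is the last child of its parent, so either case would otherwise throw a NullReferenceException.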
