How to parse html source code with ruby/nokogiri?

Developer (https://www.devze.com), 2023-01-21 08:28. Source: Internet

I've successfully used Ruby (1.8) and Nokogiri's CSS parsing to pull front-facing data out of web pages.

However, I now need to pull some data from a series of pages where the data is in the "meta" tags in the page source.

One of the lines I need is the following:

<meta name="geo.position" content="35.667459;139.706256" />

I've tried using XPath but haven't been able to get it right.

Any help as to what syntax is needed would be much appreciated.

Thanks

This is a good case for a CSS attribute selector. For example:

doc.css('meta[name="geo.position"]').each do |meta_tag|
  puts meta_tag['content'] # => 35.667459;139.706256
end

The equivalent XPath expression is almost identical:

doc.xpath('//meta[@name = "geo.position"]').each do |meta_tag|
  puts meta_tag['content'] # => 35.667459;139.706256
end

require 'nokogiri'

doc = Nokogiri::HTML('<meta name="geo.position" content="35.667459;139.706256" />')
doc.at('//meta[@name="geo.position"]')['content'] # => "35.667459;139.706256"
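
Once the content attribute has been pulled out, the value is still a single string. As a rough sketch, assuming the value follows the usual "latitude;longitude" convention for geo.position, it could be split into two floats like this. The helper name parse_geo_position is made up for illustration and is not part of Nokogiri; it returns nil for a missing or malformed value, which is handy because doc.at returns nil on pages that lack the tag.

```ruby
# Hypothetical helper (not part of Nokogiri): turns a geo.position
# content string such as "35.667459;139.706256" into [lat, lon].
# Returns nil when the value is missing or malformed.
def parse_geo_position(content)
  return nil if content.nil?

  parts = content.split(';').map(&:strip)
  return nil unless parts.length == 2

  # Float() is strict and raises ArgumentError on non-numeric text
  parts.map { |part| Float(part) }
rescue ArgumentError
  nil
end

p parse_geo_position('35.667459;139.706256') # => [35.667459, 139.706256]
p parse_geo_position(nil)                    # => nil
```

When looping over a series of pages, the string passed in would be the meta tag's content attribute, or nil if the tag was not found.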
