Does XML::LibXML::Reader read HTML?

开发者 https://www.devze.com 2022-12-28 02:26 Source: Internet
I didn't find anything about parsing HTML in the XML::LibXML::Reader documentation, and when I tried to parse an HTML page it didn't work. Am I right to conclude that XML::LibXML::Reader doesn't work with HTML?


No, unless the document is really XHTML. XML is much more restrictive than HTML, and XML parsers normally can't parse HTML.

HTML::TokeParser (or its base class HTML::PullParser) is the closest match to XML::LibXML::Reader, though the two are not all that similar.
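To give a feel for the pull-style interface, here is a minimal sketch that walks a small HTML fragment with HTML::TokeParser and collects its links. The sample markup, the example.com URLs, and the variable names are made up for illustration; the unclosed `<p>` and `<br>` tags are the kind of thing an XML parser would reject.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input with unclosed <p> and <br> tags (not well-formed XML).
my $html = <<'HTML';
<html><body>
<p>See <a href="https://example.com/a">first link</a><br>
<p>and <a href="https://example.com/b">second link</a>
</body></html>
HTML

# Passing a reference to a scalar makes the parser read from that string.
my $parser = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my @links;
# get_tag('a') skips ahead to the next <a> start tag and returns
# [tag, \%attr, \@attrseq, $text], or undef when the input is exhausted.
while (my $tag = $parser->get_tag('a')) {
    my $href = $tag->[1]{href};
    next unless defined $href;
    # Collect the text up to the matching </a>, with whitespace trimmed.
    my $text = $parser->get_trimmed_text('/a');
    push @links, [$href, $text];
}

print "$_->[0]\t$_->[1]\n" for @links;
```

As with XML::LibXML::Reader, you pull the next item from the stream yourself instead of registering callbacks, but the tokens are tags and text rather than XML nodes.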

You might want to look at HTML-Tree for something similar to LibXML that does work with HTML. There's also HTML::TreeBuilder::LibXML, which wraps an even more LibXML-compatible interface around HTML-Tree.
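If a tree with XPath queries is what you are after, a minimal sketch with HTML::TreeBuilder::LibXML might look like the following. It assumes the module is installed from CPAN; the sample markup and variable names are hypothetical.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::LibXML;

# Hypothetical sample input; the <p> is never closed.
my $html = <<'HTML';
<html><head><title>Demo</title></head>
<body><p>See <a href="https://example.com/a">first link</a></body></html>
HTML

my $tree = HTML::TreeBuilder::LibXML->new;
$tree->parse($html);
$tree->eof;    # signal end of input so the tree is finalized

# XPath queries in the style of HTML::TreeBuilder::XPath.
my $title = $tree->findvalue('//title');
my @hrefs = map { $_->attr('href') } $tree->findnodes('//a');

print "title: $title\n";
print "link: $_\n" for @hrefs;
```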


No, but HTML::TreeBuilder::LibXML implements a compatible interface on top of an HTML parser.
