
How can I parse the following string found in HTML and build a DOM tree in Java?

devze.com (https://www.devze.com), 2023-01-31 09:34. Source: the web
I have the string below in an HTML page. I want to build a DOM tree from it and extract the name/value pairs. How can I do this with an HTML parser, an XML parser, or a regular expression? Any code snippet would be useful. Thanks.



<$$TagStarts>

<==0>Name0</==0><##0>Value0</##0>
<==1>Name1</==1><##1>Value1</##1>
<==2>Name2</==2><##2>Value2</##2>
<==3>Name3</==3><##3>Value3</##3>
<==4>Name4</==4><##4>Value4</##4>
<==5>Name5</==5><##5>Value5</##5>

</$$TagStarts>



I am assuming the tag names above are only placeholders and that your real markup uses meaningful tag names.

Try one of the following HTML parsers:

http://home.ccil.org/~cowan/XML/tagsoup/

http://nekohtml.sourceforge.net/

http://jtidy.sourceforge.net/

They will give you a W3C-compliant Document object. After that, it is just a matter of calling getElementsByTagName or getElementById, or of using XPath or XQuery, to get the elements out of the DOM.
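
As a rough sketch of that second step, the example below walks a W3C Document with XPath. It makes two assumptions that are not in the question: the markup has been rewritten with hypothetical, well-formed tag names (items, item, name, value), and the JDK's built-in XML parser stands in for TagSoup, NekoHTML, or JTidy so that the example runs without extra libraries. The class name DomPairs is made up for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomPairs {
    // Hypothetical, well-formed stand-in for the markup in the question.
    static final String SAMPLE =
        "<items>"
        + "<item><name>Name0</name><value>Value0</value></item>"
        + "<item><name>Name1</name><value>Value1</value></item>"
        + "</items>";

    // Parses the string into a W3C Document and collects name/value pairs.
    static Map<String, String> extract(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList items =
            (NodeList) xpath.evaluate("//item", doc, XPathConstants.NODESET);

        // LinkedHashMap keeps the pairs in document order.
        Map<String, String> pairs = new LinkedHashMap<>();
        for (int i = 0; i < items.getLength(); i++) {
            // Relative XPath expressions are evaluated against each <item>.
            String name = xpath.evaluate("name", items.item(i));
            String value = xpath.evaluate("value", items.item(i));
            pairs.put(name, value);
        }
        return pairs;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extract(SAMPLE)); // prints {Name0=Value0, Name1=Value1}
    }
}
```

With one of the HTML parsers listed above, only the step that produces the Document should need to change; the XPath part operates on the standard org.w3c.dom interfaces.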

Otherwise, you can use one of the following. They have their own document object implementations:

http://htmlcleaner.sourceforge.net/ [It also has some basic XPath support]

http://jsoup.org/ [It has a jQuery-like query API]

Addendum: see the jsoup selector syntax at http://jsoup.org/cookbook/extracting-data/selector-syntax

I would recommend either jsoup or NekoHTML.
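
The question also asks about regular expressions. If the tag names really are literal, note that names such as ==0 and ##0 are not valid XML element names, so a strict XML parser will reject the string. The following is only a minimal sketch of a regex fallback, not something recommended above. It assumes every name element is immediately followed by its value element with the same index, and that the text contains no nested markup. The class name LiteralTagPairs is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralTagPairs {
    // Matches <==N>name</==N><##N>value</##N>.
    // The backreference \1 forces all four tags to carry the same index N.
    private static final Pattern PAIR = Pattern.compile(
        "<==(\\d+)>(.*?)</==\\1>\\s*<##\\1>(.*?)</##\\1>", Pattern.DOTALL);

    // Collects the pairs in the order they appear in the input.
    static Map<String, String> extract(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.put(m.group(2), m.group(3));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String input = "<$$TagStarts>\n"
            + "<==0>Name0</==0><##0>Value0</##0>\n"
            + "<==1>Name1</==1><##1>Value1</##1>\n"
            + "</$$TagStarts>";
        System.out.println(extract(input)); // prints {Name0=Value0, Name1=Value1}
    }
}
```

This yields a flat map rather than a DOM tree, so it only fits the case where the name/value pairs are all that is needed.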
