How is web browser search implemented?

Developer https://www.devze.com 2023-01-16 06:21 Source: Internet
I want to implement, in a Java desktop application, searching for and highlighting multiple phrases in HTML files, the way it is done in web browsers. HTML tags (everything within < and >) are skipped when matching, but not all tags are treated alike: a tag like <b> does not interrupt the text, while a tag like <p> does. For example, when searching for "each table", the text ...each <b>table</b> has name... should be highlighted, but the text ...has each</p><p> Table is... should not be highlighted, because the <p> tag interrupts the meaning of the text.

Web browsers implement this somehow. How can I get at that implementation, or is there some source code on the net? I tried Google, but without success :(


Instead of searching inside the actual HTML file, browsers search the rendered output of that HTML.

Get a suitable HTML renderer and get its output as text. Then search on that text output using appropriate string searching algorithms.

In the second example from your question, the paragraph break results in a newline character in the rendered output, so a normal string searching algorithm will behave as you expect.
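As a rough illustration of that idea, here is a minimal sketch that does not use a real renderer. It approximates the rendered text by dropping inline tags and turning a small, assumed set of breaking tags into newlines, and it remembers where each text character came from so that a match can be highlighted in the original HTML. The class name RenderedTextSearch, the BREAKING_TAGS list and the find method are all made up for this example. It does not collapse whitespace, decode entities or handle comments the way a browser would.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class RenderedTextSearch {
    // Assumption: a small, incomplete set of tags that interrupt running text.
    private static final Set<String> BREAKING_TAGS = Set.of(
        "p", "br", "div", "li", "ul", "ol", "table", "tr", "td", "th", "h1", "h2", "h3");

    private final StringBuilder text = new StringBuilder();   // approximated visible text
    private final List<Integer> htmlPos = new ArrayList<>();  // text index -> index in the HTML

    public RenderedTextSearch(String html) {
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            int close = (c == '<') ? html.indexOf('>', i) : -1;
            if (close < 0) {                 // ordinary character, or a '<' that is never closed
                append(c, i);
                i++;
            } else {
                if (BREAKING_TAGS.contains(tagName(html.substring(i + 1, close)))) {
                    append('\n', i);         // a breaking tag is treated as a line break
                }
                i = close + 1;               // any other tag simply vanishes
            }
        }
    }

    private void append(char c, int pos) {
        text.append(c);
        htmlPos.add(pos);
    }

    // "/p" -> "p", "br /" -> "br", "a href=..." -> "a"
    private static String tagName(String inside) {
        String s = inside.startsWith("/") ? inside.substring(1) : inside;
        return s.split("[\\s/]", 2)[0].toLowerCase(Locale.ROOT);
    }

    /** Returns {start, end} offsets into the original HTML for every case-insensitive match. */
    public List<int[]> find(String phrase) {
        List<int[]> hits = new ArrayList<>();
        if (phrase.isEmpty()) return hits;
        String s = text.toString();
        for (int at = 0; at + phrase.length() <= s.length(); at++) {
            if (s.regionMatches(true, at, phrase, 0, phrase.length())) {
                int last = at + phrase.length() - 1;
                hits.add(new int[] { htmlPos.get(at), htmlPos.get(last) + 1 });
            }
        }
        return hits;
    }
}
```

Calling find once per phrase covers the multi-phrase case. Note that a returned range can contain tags (for example the <b> in each <b>table), so the highlighting step has to wrap the text pieces inside the range rather than the range as a whole.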


As Faisal stated, browsers search in rendered content only. To do the same, you'll need to remove the HTML tags before doing the actual search:

This code might help you: http://www.dotnetperls.com/remove-html-tags

Of course you'll need to add some checks/exclusions for script tags and other things that are not rendered by the browser.
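A minimal Java sketch of such exclusions, assuming a regex-based approach similar in spirit to the linked page. The class name VisibleText and the list of hidden elements are illustrative choices, not a complete list. Comments and the content of script, style and head elements are dropped first, then the remaining tags are removed. Because every tag is removed outright, this version on its own cannot tell <b> from <p>.

```java
import java.util.regex.Pattern;

public class VisibleText {
    // Assumption: elements whose content is never shown to the reader (incomplete list).
    private static final Pattern HIDDEN = Pattern.compile(
        "<(script|style|head)\\b[^>]*>.*?</\\1\\s*>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    public static String strip(String html) {
        String s = COMMENT.matcher(html).replaceAll("");  // comments are not rendered
        s = HIDDEN.matcher(s).replaceAll("");             // drop element and its content
        return TAG.matcher(s).replaceAll("");             // drop the remaining tags only
    }
}
```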


This seems pretty easy.

1) Search for the last word in the string.
2) Look at what's before the last word.
3) Decide if what's before the last word constitutes an interruption (<p>, <br />, <div>).
4) If it is an interruption, continue.
5) Else evaluate the previous word against the search query.

I don't know if this is how browsers perform this operation, but this approach should work.
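One way to sketch the same "does the gap interrupt the text?" test in Java is a forward-scanning regex instead of the backward walk described above. The phrase is split into words, and between two words the pattern accepts only whitespace and a tag from an assumed list of non-interrupting tags. The class name InlineTolerantSearch and the tag list are illustrative assumptions. Since it runs on the raw HTML, it can also match text inside attributes or scripts.

```java
import java.util.regex.Pattern;

public class InlineTolerantSearch {
    // Assumption: tags that do not interrupt running text (incomplete list).
    private static final String INLINE_TAG =
        "</?(?:b|i|u|em|strong|span|a|font)\\b[^>]*>";

    // Gap between two words: optional inline tags, at least one whitespace
    // character, then any mix of whitespace and inline tags.
    private static final String GAP =
        "(?:" + INLINE_TAG + ")*\\s(?:\\s|" + INLINE_TAG + ")*";

    public static Pattern compile(String phrase) {
        StringBuilder regex = new StringBuilder();
        for (String word : phrase.trim().split("\\s+")) {
            if (regex.length() > 0) regex.append(GAP);
            regex.append(Pattern.quote(word));   // match each word literally
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    public static boolean contains(String html, String phrase) {
        return compile(phrase).matcher(html).find();
    }
}
```

With the examples from the question, contains("...each <b>table</b> has name...", "each table") finds a match, while the </p><p> between the words in the second example is not an accepted gap, so nothing is found there. The Matcher's start() and end() give the range to highlight.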


Try using the javax.swing.text.html package in Java.
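A minimal sketch of how that package could be used, assuming the goal is only to get searchable text out of the HTML. It feeds the HTML through ParserDelegator with an HTMLEditorKit.ParserCallback, appends the text it is handed, and adds a newline for every tag whose HTML.Tag.breaksFlow() is true. The class name SwingTextExtractor is made up for this example, and the result is the Swing parser's view of the document, which need not match what a modern browser renders.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class SwingTextExtractor {
    public static String extract(String html) {
        StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                out.append(data);                        // text between tags
            }
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t.breaksFlow()) out.append('\n');    // e.g. <p>, <div>
            }
            @Override
            public void handleEndTag(HTML.Tag t, int pos) {
                if (t.breaksFlow()) out.append('\n');
            }
            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t.breaksFlow()) out.append('\n');    // e.g. <br>
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

The pos arguments passed to the callback are positions in the parsed input, so they could be recorded alongside the text if the matches need to be mapped back to the HTML for highlighting.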
