I'm going to write a web parser (an application that crawls the web from one site to another).
How can I find a list of available domains/IPs on the internet (as complete as possible)? How do search engines find websites (what do they use as a reliable list of registered IPs/domains for a starting point)? Thanks
As Michael P's comment indicates, it depends on what your objective is.
My company recently wanted to answer a question about third-party tools used on leading websites. I used Alexa as a starting point to find the top (by traffic) websites, and created a parser that can answer the specific question my company asked. If you start from such a list, you can program your web crawler to follow the links it encounters to broaden your knowledge of sites on the web.
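To make the "start from a list, then follow links" idea concrete, here is a minimal sketch in Python using only the standard library. It is not the parser I wrote for my company, just an illustration: the names `crawl`, `extract_links`, `fetch_page` and the `max_pages` limit are all made up for this example, and the seed URLs would come from whatever top-sites list you choose.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, with fragments removed."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme in ("http", "https"):
            links.append(absolute)
    return links


def fetch_page(url):
    """Hypothetical helper: download a page as text, '' on network or URL errors."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def crawl(seeds, fetch=fetch_page, max_pages=50):
    """Breadth-first crawl starting from seeds; returns the set of hosts visited."""
    queue = deque(seeds)
    visited = set()
    hosts = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        hosts.add(urlparse(url).netloc)
        # Every link found on this page becomes a candidate for a later visit.
        for link in extract_links(fetch(url), url):
            if link not in visited:
                queue.append(link)
    return hosts
```

You would call it as `crawl(["https://example.com/"], max_pages=100)` with your own seed list. The `fetch` parameter is there so you can swap in a different downloader (or a fake one for testing). A real crawler would also need to honor robots.txt and throttle its requests, which this sketch leaves out.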
Hopefully that helps you think about the problem.