Finding every page for a given domain

https://www.devze.com 2023-03-29 18:18 (source: web)
Is there any tool/library for Ruby that, when given a domain name, will return a list of all the pages at that domain?


You could use Anemone, a Ruby web spider framework. It requires Nokogiri as a dependency, since it needs to parse the (X)HTML.
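A minimal sketch of that approach, assuming the `anemone` gem is installed (`gem install anemone`); the `enumerate_pages` helper name is mine, not part of the library:

```ruby
require 'anemone'

# Crawl a domain and collect the URL of every page Anemone discovers.
# Anemone stays on the starting host by default, so this approximates
# "all the pages at that domain".
def enumerate_pages(start_url)
  urls = []
  Anemone.crawl(start_url) do |anemone|
    anemone.on_every_page do |page|
      urls << page.url.to_s
    end
  end
  urls
end

# enumerate_pages("http://example.com/") would return an array of page URLs.
```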


Enumeration is a difficult task if a site is anything other than a collection of static HTML pages. Once you get into server-side scripting of any kind, the "page" returned can rely heavily on the state of your session. An obvious example would be pages or resources only accessible after you log in. Because of this, many automated enumeration tools (usually part of web application security auditing programs) get it wrong and miss large portions of the site. My point here is that there is often more to enumeration than simply running a tool.

The good news is that it's quite easy to write your own enumerator that works well given a bit of knowledge you can obtain mostly from just poking around on a site. I wrote something similar using Mechanize, which handily tracks your history as you request pages. So it's a pretty simple task of getting Mechanize to set up the server-side state you need (namely, logging in) and then visiting every link you find. Simply request the front page, or any "list" pages that you need and keep an array of links. Iterate over this list of links and, if the link is not in the history, go to that link and store the list of links on that page. Repeat until the list of links is empty.
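The loop described above can be sketched with plain Ruby. Here `fetch_links` is a hypothetical stand-in for the Mechanize step (logging in, requesting a page, reading its links); the usage example below replaces real HTTP with an in-memory hash so the traversal itself is visible:

```ruby
require 'set'

# Visit every reachable link exactly once, starting from start_url.
# fetch_links is any callable that maps a URL to the list of URLs
# linked from that page (e.g. a Mechanize fetch in real use).
def crawl(start_url, &fetch_links)
  history = Set.new
  queue   = [start_url]
  until queue.empty?
    url = queue.shift
    next if history.include?(url)   # skip anything already visited
    history << url
    queue.concat(fetch_links.call(url))
  end
  history.to_a
end

# Usage with a fake in-memory "site" instead of real HTTP requests:
site = {
  "http://example.com/"  => ["http://example.com/a", "http://example.com/b"],
  "http://example.com/a" => ["http://example.com/b"],
  "http://example.com/b" => ["http://example.com/"]
}
pages = crawl("http://example.com/") { |url| site.fetch(url, []) }
```

Swapping the block for a Mechanize-backed fetch (after logging in, so the server-side session state is right) gives you the enumerator described above.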

But like I said, it all depends on what's happening server-side. There may be pages that aren't linked to, or that aren't accessible to you, which you won't be able to discover this way.
