I'm using C#, and I'd like to scrape all the content on a site (but not the images, scripts, or files that may be attached to the page). How do I do that with C# and ASP.NET?
You can use the following code snippet to download the HTML of a page:
// Requires: using System; using System.IO; using System.Net; using System.Text;
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.your-url.com");
// Dispose the response and its stream when done so the connection is released.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream resStream = response.GetResponseStream())
// Assumes the page is UTF-8 encoded; Encoding.ASCII would garble any non-ASCII characters.
using (StreamReader reader = new StreamReader(resStream, Encoding.UTF8))
{
    // Only the HTML markup is downloaded; referenced images and scripts are not fetched.
    string html = reader.ReadToEnd();
    Console.WriteLine(html);
}
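The question asks for the content without images or scripts, and the downloaded markup still contains the `<script>` and `<img>` tags themselves. A minimal sketch of stripping them follows. The class `HtmlTextExtractor` and its method `ExtractText` are hypothetical names introduced here, not part of the original answer. Regex-based stripping is crude, so a real HTML parser is likely a better choice for malformed or complex pages.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextExtractor
{
    // Hypothetical helper: crude, regex-based extraction of visible text from HTML.
    public static string ExtractText(string html)
    {
        // Drop <script> and <style> elements together with their contents.
        string noScripts = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Drop HTML comments.
        string noComments = Regex.Replace(noScripts, @"<!--.*?-->", " ", RegexOptions.Singleline);

        // Drop every remaining tag, including <img>.
        string noTags = Regex.Replace(noComments, @"<[^>]+>", " ");

        // Decode entities such as &amp; and collapse runs of whitespace.
        string decoded = WebUtility.HtmlDecode(noTags);
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

Passing the downloaded markup to `HtmlTextExtractor.ExtractText` returns the text with tags, scripts, and styles removed.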
If the page is one of your own ASP.NET pages, you can also capture its HTML by overriding the Page's Render method, as follows:
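protected override void Render(System.Web.UI.HtmlTextWriter writer)
{
    // Render the page into an in-memory buffer instead of directly to the response.
    StringBuilder sb = new StringBuilder();
    StringWriter sw = new StringWriter(sb);
    // Named bufferWriter because a local called "writer" would clash with the parameter.
    HtmlTextWriter bufferWriter = new HtmlTextWriter(sw);
    base.Render(bufferWriter);
    string markupText = sb.ToString();
    // markupText now contains the HTML of the page
    writer.Write(markupText);
}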