
C# HtmlAgilityPack: the operation has timed out

Developer https://www.devze.com 2023-03-24 11:42 Source: Internet
How do you increase the timeout value for HtmlAgilityPack? I'm getting this error a lot and want to raise the timeout limit. Alternatively, how do you kill the request and try again?

resultingHTML = null;
        try
        {
            string htmlstring = string.Empty;
            HttpWebRequest newwebRequest = (HttpWebRequest)WebRequest.Create(htmlURL);
            HttpWebResponse mywebResponce = (HttpWebResponse)newwebRequest.GetResponse();
            if (mywebResponce.StatusCode == HttpStatusCode.OK)
            {
                Stream ReceiveStream = mywebResponce.GetResponseStream();
                using (StreamReader reader = new StreamReader(ReceiveStream))
                {
                    htmlstring = reader.ReadToEnd();
                }
                HtmlDocument doc = new HtmlDocument();
                doc.Load(htmlstring);
                HtmlWeb hwObject = new HtmlWeb();
                HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");
                resultingHTML = body.InnerHtml.ToString();
            }

        }


I assume you're using HtmlAgilityPack to read HTML via a web request here?

I would advise using the framework's WebRequest object instead, where you can specify a timeout:

http://msdn.microsoft.com/en-us/library/system.net.webrequest.getresponse.aspx#Y700

You can catch a timeout (and other connection errors) by wrapping the call in a try/catch block.

Then parse the resulting HTML from the WebResponse object with HtmlAgilityPack directly.

Here is an example of how to get the HTML from the WebResponse:

http://msdn.microsoft.com/en-us/library/system.net.webresponse.getresponsestream.aspx

Once you have the html as a string from the WebResponse you would:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
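
To cover the "kill the request and try again" part of the question, here is a minimal sketch, assuming the framework's HttpWebRequest and C# 6 or later (for the exception filter). FetchWithRetry and Download are hypothetical helper names, not part of HtmlAgilityPack or the framework. It relies on a timed-out GetResponse() surfacing as a WebException whose Status is WebExceptionStatus.Timeout.

```csharp
using System;
using System.IO;
using System.Net;

public static class TimeoutRetry
{
    // Hypothetical helper: runs fetch() and retries only when the failure
    // is a timeout. Any other exception, or a timeout on the last allowed
    // attempt, propagates to the caller.
    public static string FetchWithRetry(Func<string> fetch, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return fetch();
            }
            catch (WebException ex)
                when (ex.Status == WebExceptionStatus.Timeout && attempt < maxAttempts)
            {
                // Timed out: fall through and call fetch() again,
                // which builds a fresh request.
            }
        }
    }

    // Hypothetical helper: downloads the page body as a string,
    // giving up on GetResponse() after timeoutMs milliseconds.
    public static string Download(string url, int timeoutMs)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = timeoutMs;
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

You would call it as TimeoutRetry.FetchWithRetry(() => TimeoutRetry.Download(htmlURL, 10000), 3) and pass the returned string to doc.LoadHtml as above. The attempt count of 3 is an arbitrary choice for illustration.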


HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create("http://www.someurl.com");
httpWebRequest.Timeout = 10000; // 10 second timeout
using (HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse())
{
    if (httpWebResponse.StatusCode == HttpStatusCode.OK)
    {
        using (Stream responseStream = httpWebResponse.GetResponseStream())
        using (StreamReader reader = new StreamReader(responseStream))
        {
            var htmlstring = reader.ReadToEnd();
            HtmlDocument doc = new HtmlDocument();
            // LoadHtml parses markup held in a string; Load(string) expects a file path
            doc.LoadHtml(htmlstring);
        }
    }
}

I would also look at "Adjusting HttpWebRequest Connection Timeout in C#" to understand the difference between Timeout and ReadWriteTimeout on the HttpWebRequest class.
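
As a rough illustration of the two properties, the sketch below sets both on one request. Timeout limits how long GetResponse() (and GetRequestStream()) will wait, while ReadWriteTimeout limits each read or write on the stream once you have it. CreateRequest and its parameter names are hypothetical, chosen only for this example.

```csharp
using System;
using System.Net;

public static class TimeoutSettings
{
    // Hypothetical helper: builds a request with both timeouts set explicitly.
    // Both values are in milliseconds.
    public static HttpWebRequest CreateRequest(string url, int responseTimeoutMs, int readWriteTimeoutMs)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = responseTimeoutMs;           // waiting for the response to start
        request.ReadWriteTimeout = readWriteTimeoutMs; // each read/write on the stream
        return request;
    }
}
```

If you leave them alone, the documented defaults are 100,000 ms for Timeout and 300,000 ms for ReadWriteTimeout. A slow page can therefore still stall in ReadToEnd() well after GetResponse() has returned.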
