Questions about code using Task queue for parallel web gets

Developer https://www.devze.com 2023-03-28 03:05 Source: web
So I've got this code to drill down into a hierarchy of XML documents from a REST API. I posted earlier to get advice on how to make it recursive, then I went ahead and made it parallel.

First, I was SHOCKED by how fast it ran: it pulled down 318 XML docs in just under 12 seconds, compared to well over 10 minutes single-threaded. I really didn't expect to gain that much. Is there some catch to this? It seems too good to be true.

Second, I suspect this code is implementing a common pattern, but possibly in a non-idiomatic way. I have a kind of producer-consumer queue going, with two separate lock objects. Is there a more standard way I could have done this?

Code.

        public class ResourceGetter
        {
            public ResourceGetter(ILogger logger, string url)
            {
                this.logger = logger;
                this.rootURL = url;
            }
            public List<XDocument> GetResources()
            {
                GetResources(rootURL);
                while (NumTasks() > 0) RemoveTask().Wait();
                return resources;
            }
            void GetResources(string url)
            {
                logger.Log("Getting resources at " + url);
                AddTask(Task.Factory.StartNew(new Action(() =>
                {
                    var doc = XDocument.Parse(GetXml(url));
                    if (deserializer.CanDeserialize(doc.CreateReader()))
                    {
                        var rl = (resourceList)deserializer.Deserialize(doc.CreateReader());
                        foreach (var item in rl.resourceURL)
                        {
                            GetResources(url + item.location);
                        }
                    }
                    else
                    {
                        logger.Log("Got resource for " + url);
                        AddResource(doc);
                    }
                })));
            }
            object resourceLock = new object();
            List<XDocument> resources = new List<XDocument>();
            void AddResource(XDocument doc)
            {
                lock (resourceLock)
                {
                    logger.Log("add resource");
                    resources.Add(doc);
                }
            }
            object taskLock = new object();
            Queue<Task> tasks = new Queue<Task>();
            void AddTask(Task task)
            {
                lock (taskLock)
                {
                    tasks.Enqueue(task);
                }
            }
            Task RemoveTask()
            {

                lock (taskLock)
                {
                    return tasks.Dequeue();
                }
            }
            int NumTasks()
            {
                lock (taskLock)
                {
                    logger.Log(tasks.Count + " tasks left");
                    return tasks.Count;
                }
            }
            ILogger logger;
            XmlSerializer deserializer = new XmlSerializer(typeof(resourceList));
            readonly string rootURL;
        }


Offhand, I wouldn't bother with the code for managing the task list, all the locking, and the NumTasks() method. It would be simpler to use a CountdownEvent, which is thread-safe to begin with. Increment it (AddCount) when you create a new task and decrement it (Signal) when a task finishes, then Wait() on it. That is much like what you are doing now, but without the explicit locking.
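
To make that concrete, here is a rough sketch of the CountdownEvent approach, not a drop-in replacement for the class in the question. The names `CountdownCrawler`, `Crawl`, `Visit` and the `fetch` delegate are made up for illustration. `fetch` stands in for the GetXml-plus-deserialize step and is assumed to return the child URLs of a listing, or an empty sequence for a leaf resource. That is a simplification of the CanDeserialize check in the original.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch only: all names here are hypothetical.
public class CountdownCrawler
{
    // Stand-in for "download the XML and work out what it links to".
    // Assumed to return child URLs, or an empty sequence for a leaf resource.
    private readonly Func<string, IEnumerable<string>> fetch;

    // Thread-safe collection, so no separate lock object for the results.
    private readonly ConcurrentBag<string> leaves = new ConcurrentBag<string>();

    // Starts at 1. Crawl holds that initial count until the root task has
    // been scheduled, so the count cannot reach zero too early.
    private readonly CountdownEvent pending = new CountdownEvent(1);

    public CountdownCrawler(Func<string, IEnumerable<string>> fetch)
    {
        this.fetch = fetch;
    }

    public string[] Crawl(string rootUrl)
    {
        Visit(rootUrl);
        pending.Signal(); // release the initial count held by Crawl
        pending.Wait();   // blocks until every task has signalled
        return leaves.ToArray();
    }

    private void Visit(string url)
    {
        // Increment before the task starts. The caller (Crawl or a parent
        // task) has not signalled yet, so the count is still above zero,
        // which AddCount requires.
        pending.AddCount();
        Task.Factory.StartNew(() =>
        {
            try
            {
                bool isLeaf = true;
                foreach (var child in fetch(url))
                {
                    isLeaf = false;
                    Visit(child);
                }
                if (isLeaf) leaves.Add(url);
            }
            finally
            {
                pending.Signal(); // decrement even if fetch throws
            }
        });
    }
}
```

Usage would look something like `new CountdownCrawler(myFetch).Crawl(rootUrl)`, where `myFetch` is whatever wraps your HTTP call. One trade-off to be aware of: because nothing calls Wait() on the individual tasks, an exception thrown inside a task is not rethrown to the caller here, whereas the queue-of-tasks version in the question would surface it from Wait().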
