Crash course in simple threading?

Developer (https://www.devze.com), 2023-03-10 19:23. Source: the web
I am testing out the idea of threads, but only in very key spots right now. Threads add a pretty fascinating level of complexity to just about anything, and .NET offers many choices for threading within System.Threading. I'd like to know which is best for handling string operations.

Consider a complex string being fed to a custom object. That object currently splits the string at some point, and feeds part one to a function, then when that function completes, feeds the other half of the string to a second function. The two functions have no dependencies on each other, so should be good candidates for threading so that both functions can work concurrently on each piece of the string.

Example before threading:

Public Sub ParseString(ByVal SomeStr As String)
    If String.IsNullOrWhitespace(SomeStr) Then
        Throw New ArgumentNullException("SomeStr")
    End If

    ' Assume that ParsedFirstString is a boolean that is set to
    ' True if the call to ParseFirstString completes successfully.
    ' Ditto for ParsedSecondString.

    Dim MyDelimiter As Char = "|"c
    Dim SomeStrArr As String() = SomeStr.Split({MyDelimiter}, 2)

    Call Me.ParseFirstString(SomeStrArr(0))

    If Me.ParsedFirstString = False Then
        Throw New ArgumentException("Failed to parse the first part of the string.")
    End If

    Call Me.ParseSecondString(SomeStrArr(1))

    If Me.ParsedSecondString = False Then
        Throw New ArgumentException("Failed to parse the second part of the string.")
    End If
End Sub

This works fine, and testing inside a timing loop on my multicore system, I can execute it 1,000 times in ~140ms-170ms (avg ~1,200ms+ if 10,000 times). This is an acceptable speed and if I can't get threading to play nice, then I'll move on. But I tried one threading approach after looking at one threading example and an SO question on invoking a thread with parameters and wound up with code similar to the following:

Public Sub ParseString(ByVal SomeStr As String)
    If String.IsNullOrWhitespace(SomeStr) Then
        Throw New ArgumentNullException("SomeStr")
    End If

    Dim MyDelimiter As Char = "|"c
    Dim SomeStrArr As String() = SomeStr.Split({MyDelimiter}, 2)

    Dim FirstThread As New Thread(Sub() Me.ParseFirstString(SomeStrArr(0)))
    Dim SecondThread As New Thread(Sub() Me.ParseSecondString(SomeStrArr(1)))

    FirstThread.Priority = ThreadPriority.Highest
    SecondThread.Priority = ThreadPriority.Highest

    Call FirstThread.Start()
    Call SecondThread.Start()

    If Me.ParsedFirstString = False Then
        Throw New ArgumentException("Failed to parse the first part of the string.")
    End If

    If Me.ParsedSecondString = False Then
        Throw New ArgumentException("Failed to parse the second part of the string.")
    End If
End Sub

The problem with this is parsing of either the first or second parts of the string can complete before both are done, which trips up one of the two exceptions. So I looked around further and found that I could use the Join method to wait for both threads to complete. This solves the tripping up of the exceptions, but it drastically increases the execution time. Executing the above function 1,000 times and timing it now yields an average runtime of up to ~3,700ms. It almost seems like threading is just not suitable for this kind of task.
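For reference, the Join variant mentioned above might look roughly like the following. This is only a sketch: the `JoinSketch` module, the `ParsedFirst`/`ParsedSecond` flags, and the trivial parser bodies are hypothetical stand-ins for the real object, and the priority settings are left out. It creates two new threads on every call, which is consistent with the slowdown reported once the function runs 1,000 times in a loop.

```vbnet
Imports System
Imports System.Threading

Module JoinSketch
    ' Hypothetical stand-ins for the ParsedFirstString / ParsedSecondString flags.
    Private ParsedFirst As Boolean
    Private ParsedSecond As Boolean

    ' Hypothetical parsers: "success" here just means the piece is non-empty.
    Private Sub ParseFirstString(ByVal Part As String)
        ParsedFirst = Part.Length > 0
    End Sub

    Private Sub ParseSecondString(ByVal Part As String)
        ParsedSecond = Part.Length > 0
    End Sub

    Public Sub ParseString(ByVal SomeStr As String)
        Dim SomeStrArr As String() = SomeStr.Split({"|"c}, 2)
        If SomeStrArr.Length < 2 Then
            Throw New ArgumentException("Expected a '|' delimiter.", "SomeStr")
        End If

        Dim FirstThread As New Thread(Sub() ParseFirstString(SomeStrArr(0)))
        Dim SecondThread As New Thread(Sub() ParseSecondString(SomeStrArr(1)))

        FirstThread.Start()
        SecondThread.Start()

        ' Block until both workers finish; only then are the flags meaningful.
        FirstThread.Join()
        SecondThread.Join()

        If Not ParsedFirst Then
            Throw New ArgumentException("Failed to parse the first part of the string.")
        End If
        If Not ParsedSecond Then
            Throw New ArgumentException("Failed to parse the second part of the string.")
        End If
    End Sub

    Sub Main()
        ParseString("alpha|beta")
        Console.WriteLine("Both parts parsed.")
    End Sub
End Module
```

The checks run only after both Join calls return, so neither exception can fire early.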

But it appears that there are other mechanisms for threading, including ThreadPools and BackgroundWorkers. Probably others I haven't looked up yet (I just started messing with this a few hours ago).

What is the community's opinion on threading for this kind of task? What is wrong with my first attempt at threading?

FYI, I am not updating any UI components nor writing results out to any kind of storage medium.

Conclusion:

It appears my string parsing functions are a lot better than I thought. I tried both the Parallel class and the Task class. In a 10,000-iteration loop, the single-threaded version takes about 1,220ms-1,260ms on my test data. Even the basic Parallel.Invoke() approach, splitting the parsing into two parallel threads, adds up to ~300ms to that loop (likely due to the overhead of the anonymous delegate, and there seems to be no way around this). For comparison, this is on a Core2 Q9550 Yorkfield 95W processor, not overclocked.

The winning choice is to remain single-threaded for this specific area of code. Thanks to all who participated!


I suggest using TPL classes like Parallel and Task.

Whether your code benefits from parallel execution is something you have to benchmark on the particular machine to find out. The same code can run slower on one machine and much faster on another. It basically depends on the CPU (number of cores, hyper-threading, etc.), the algorithm, and the number of parallel tasks.
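One way to run such a benchmark is a small Stopwatch harness like the sketch below. Everything in it is illustrative: `TimingSketch`, `TimeIt`, and `CountVowels` are hypothetical names, and `CountVowels` merely stands in for a real parsing function. The numbers it prints will differ from machine to machine.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Threading.Tasks

Module TimingSketch
    ' Hypothetical workload standing in for one of the parsing functions.
    Public Function CountVowels(ByVal Text As String) As Integer
        Dim Count As Integer = 0
        For Each Ch As Char In Text
            If "aeiou".IndexOf(Ch) >= 0 Then Count += 1
        Next
        Return Count
    End Function

    ' Runs Work the requested number of times and returns elapsed milliseconds.
    Public Function TimeIt(ByVal Iterations As Integer, ByVal Work As Action) As Long
        Work() ' Warm-up call so JIT compilation is not counted.
        Dim Watch As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            Work()
        Next
        Watch.Stop()
        Return Watch.ElapsedMilliseconds
    End Function

    Sub Main()
        Dim Parts As String() = "some first half|some second half".Split({"|"c}, 2)

        ' Both halves processed one after the other on the calling thread.
        Dim SerialMs As Long = TimeIt(10000,
            Sub()
                CountVowels(Parts(0))
                CountVowels(Parts(1))
            End Sub)

        ' Both halves handed to Parallel.Invoke on every iteration.
        Dim ParallelMs As Long = TimeIt(10000,
            Sub()
                Parallel.Invoke(
                    Sub() CountVowels(Parts(0)),
                    Sub() CountVowels(Parts(1)))
            End Sub)

        Console.WriteLine("Serial:   {0} ms", SerialMs)
        Console.WriteLine("Parallel: {0} ms", ParallelMs)
    End Sub
End Module
```

With a workload this small, the parallel figure will usually be the larger one, which matches the question's own findings.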

If you use TPL your code would look as simple as:

    ' Runs both parsers in parallel and returns once both have finished.
    Call Parallel.Invoke(
        Sub()
            Me.ParseFirstString(SomeStrArr(0))
        End Sub,
        Sub()
            Me.ParseSecondString(SomeStrArr(1))
        End Sub)

I'm sorry, I'm not good at VB.NET syntax. There might be a way to make it shorter.


Threading (or parallelization in general) is beneficial when the work being executed takes up more CPU cycles than is taken up by the overhead of creating/managing/joining multiple threads.

If your parsing code and strings are relatively simple, multithreading will actually make things slower.

Thread creation is a relatively expensive operation. .NET 4 introduced the concept of "Tasks". A Task is a block of code that you want executed in parallel with other code. The .NET Framework has lots of smarts built in to split all of your tasks up amongst an ideal number of threads (generally the same as the number of CPU cores you have) and to reuse the same threads for multiple tasks.

Tasks still have a non-trivial amount of overhead but much less so than raw threads. So in many cases Tasks will still make things slower than serial code but those cases are fewer. Without seeing your input strings and parsing methods, we can't say where your specific scenario fits on this spectrum.
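As a rough illustration of the Task approach, the sketch below starts one task per half of the string and waits for both. The `TaskSketch` module and the parser bodies are hypothetical; here the parsers return their success flag directly instead of setting a field, which is an assumption made to keep the example short.

```vbnet
Imports System
Imports System.Threading.Tasks

Module TaskSketch
    ' Hypothetical parsers that return True on success.
    Public Function ParseFirstString(ByVal Part As String) As Boolean
        Return Part.Length > 0
    End Function

    Public Function ParseSecondString(ByVal Part As String) As Boolean
        Return Part.Length > 0
    End Function

    Sub Main()
        Dim SomeStrArr As String() = "alpha|beta".Split({"|"c}, 2)

        ' Each StartNew call queues the work to the task scheduler.
        Dim FirstTask As Task(Of Boolean) =
            Task.Factory.StartNew(Function() ParseFirstString(SomeStrArr(0)))
        Dim SecondTask As Task(Of Boolean) =
            Task.Factory.StartNew(Function() ParseSecondString(SomeStrArr(1)))

        ' Block until both tasks have completed before reading the results.
        Task.WaitAll(FirstTask, SecondTask)

        Console.WriteLine("First: {0}, Second: {1}", FirstTask.Result, SecondTask.Result)
    End Sub
End Module
```

Task.Factory.StartNew is used because it is available in .NET 4. If a parser throws, Task.WaitAll surfaces the failure as an AggregateException.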


To me, offloading any calculations away from the UI thread is the best option, especially if they could be time consuming for any reason: quantity, complexity, or, worst of all, both.

As you found, there are a number of ways to do it. A lot depends on what kinds of information you want after/during.

While my .NET language of choice is C#, the premise is the same.

You can use tasks, which seem to work quite nicely because you can easily give them options. Parallel.Invoke allows for options as well.

As for background workers, I found them harder to work with unless you know exactly how many you're going to have. If you use the component from the toolbox, you have to create them in advance, and you also need to check whether each one is busy before sending it work. Personally, I wanted something I could tell "Here, go do this", like a bunch of minions waiting in a queue for jobs to do.
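The thread pool is one way to get that "here, go do this" behaviour. The answer does not say which mechanism its author actually used, so treat the following as an assumption: a minimal sketch in which `PoolSketch`, `ParseAll`, and the non-empty check are hypothetical, and a CountdownEvent tracks when every queued job has finished.

```vbnet
Imports System
Imports System.Threading

Module PoolSketch
    ' Queues one work item per job and returns how many jobs "parsed".
    ' The non-empty check is a hypothetical stand-in for real parsing.
    Public Function ParseAll(ByVal Jobs As String()) As Integer
        Dim Parsed As Integer = 0

        Using Pending As New CountdownEvent(Jobs.Length)
            For Each Job As String In Jobs
                Dim Current As String = Job ' Copy so each lambda captures its own value.
                ThreadPool.QueueUserWorkItem(
                    Sub(State As Object)
                        If Current.Length > 0 Then Interlocked.Increment(Parsed)
                        Pending.Signal()
                    End Sub)
            Next

            ' Block until every queued job has signalled completion.
            Pending.Wait()
        End Using

        Return Parsed
    End Function

    Sub Main()
        Dim Jobs As String() = {"first message", "second message", "third message"}
        Console.WriteLine("Parsed {0} of {1} jobs.", ParseAll(Jobs), Jobs.Length)
    End Sub
End Module
```

The pool reuses its worker threads, so there is nothing to create in advance and no busy state to check before queuing more work.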

Threads can also work, but I found them a little more involved if (as I was) you're sending around 10k emails to be parsed.

By the looks of it, you have the right idea: take the code you want to run, try each of the methods, and see which one fits you and your way of thinking, and how it works out in speed. If one turns out significantly slower than the others, or slower than running the work on the UI thread, chances are there is a way to improve it.

Microsoft, of course, has plenty of examples for each approach if you need something to get you going.
