Can I switch the test and modification part in wait/signal semaphore?_问答_开发者

The classic none-busy-waiting version of wait() and signal() semaphore are implemented as below. In this verson, value can be negative.

//primitive
wait(semaphore* S)
{
    S->value--;
    if (S->value < 0)
    {
        add this process to S->list;
        block();
    }
}

//primitive
signal(semaphore* S)
{
    S->value++;
    if (S->value <= 0)
    {
        remove a process P from S->list;
        wakeup(P);
    }
}

Question: Is the following version also correct? Here I test first and modify the value. It's great if you can show me a scenario where it doesn't work.

//primitive wait().
//If (S->value > 0), the whole function is atomic
//otherise, only if(){} section is atomic
wait(semaphore* S)
{
    if (S->value <= 0)
    {
        add this process to S->list;
        block();
    }
    // here I decrement the value after the previous test and possible blocking
    S->value--;
}

//similar to wait()
signal(semaphore* S)
{
    if (S->list is not empty)
    {
        remove a process P from S->list;
        wakeup(P);
    }
    // here I increment the value afte开发者_如何学JAVAr the previous test and possible waking up
    S->value++;
}

Edit:

My motivation is to figure out whether I can use this latter version to achieve mutual exclusion, and no deadlock, no starvation.

Your modified version introduces a race condition:

Thread A: if(S->Value < 0) // Value = 1
Thread B: if(S->Value < 0) // Value = 1
Thread A: S->Value--; // Value = 0
Thread B: S->Value--; // Value = -1

Both threads have acquired a count=1 semaphore. Oops. Note that there's another problem even if they're non-preemptible (see below), but for completeness, here's a discussion on atomicity and how real locking protocols work.

When working with protocols like this, it's very important to nail down exactly what atomic primitives you are using. Atomic primitives are such that they seem to execute instantaneously, without being interleaved with any other operations. You cannot just take a big function and call it atomic; you have to make it atomic somehow, using other atomic primitives.

Most CPUs offer a primitive called 'atomic compare and exchange'. I'll abbreviate it cmpxchg from here on. The semantics are like so:

bool cmpxchg(long *ptr, long old, long new) {
    if (*ptr == old) {
        *ptr = new;
        return true;
    } else {
        return false;
    }
}

cmpxchg is not implemented with this code. It is in the CPU hardware, but behaves a bit like this, only atomically.

Now, let's add to this some additional helpful functions (built out of other primitives):

add_waitqueue(waitqueue) - Sets our process state to sleeping and adds us to a wait queue, but continues executing (ATOMIC)
schedule() - Switch threads. If we're in a sleeping state, we don't run again until awakened (BLOCKING)
remove_waitqueue(waitqueue) - removes our process from a wait queue, then sets our state to awakened if it isn't already (ATOMIC)
memory_barrier() - ensures that any reads/writes logically before this point actually are performed before this point, avoiding nasty memory ordering issues (we'll assume all other atomic primitives come with a free memory barrier, although this isn't always true) (CPU/COMPILER PRIMITIVE)

Here's how a typical semaphore acquisition routine will look. It's a bit more complex than your example, because I've explicitly nailed down what atomic operations I'm using:

void sem_down(sem *pSem)
{
    while (1) {
        long spec_count = pSem->count;
        read_memory_barrier(); // make sure spec_count doesn't start changing on us! pSem->count may keep changing though
        if (spec_count > 0)
        {
            if (cmpxchg(&pSem->count, spec_count, spec_count - 1)) // ATOMIC
                return; // got the semaphore without blocking
            else
                continue; // count is stale, try again
        } else { // semaphore count is zero
            add_waitqueue(pSem->wqueue); // ATOMIC
            // recheck the semaphore count, now that we're in the waitqueue - it may have changed
            if (pSem->count == 0) schedule(); // NOT ATOMIC
            remove_waitqueue(pSem->wqueue); // ATOMIC
            // loop around again to try to acquire the semaphore
        }
    }
}

You'll note that the actual test for a non-zero pSem->count, in a real-world semaphore_down function, is accomplished by cmpxchg. You can't trust any other read; the value can change an instant after you read the value. We simply can't separate the value check and the value modification.

The spec_count here is speculative. This is important. I'm essentially making a guess at what the count will be. It's a pretty good guess, but it's a guess. cmpxchg will fail if my guess is wrong, at which point the routine has to loop and try again. If I guess 0, then I will either be woken up (as it ceases to be zero while I'm on the waitqueue), or I will notice it's not zero anymore in the schedule test.

You should also note that there is no possible way to make a function that contains a blocking operation atomic. It's nonsensical. Atomic functions, by definition, appear to execute instantaneously, not interleaved with anything else whatsoever. But a blocking function, by definition, waits for something else to happen. This is inconsistent. Likewise, no atomic operation can be 'split up' across a blocking operation, which it is in your example.

Now, you could do away with a lot of this complexity by declaring the function non-preemptable. By using locks or other methods, you simply ensure only one thread is ever running (not including blocking of course) in the semaphore code at a time. But a problem still remains then. Start with a value of 0, where C has taken the semaphore down twice, then:

Thread A: if (S->Value < 0) // Value = 0
Thread A: Block....
Thread B: if (S->Value < 0) // Value = 0
Thread B: Block....
Thread C: S->Value++ // value = 1
Thread C: Wakeup(A)
(Thread C calls signal() again)
Thread C: S->Value++ // value = 2
Thread C: Wakeup(B)
(Thread C calls wait())
Thread C: if (S->Value <= 0) // Value = 2
Thread C: S->Value-- // Value = 1
// A and B have been woken
Thread A: S->Value-- // Value = 0
Thread B: S->Value-- // Value = -1

You could probably fix this with a loop to recheck S->value - again, assuming you are on a single processor machine and your semaphore code is preemptable. Unfortunately, these assumptions are false on all desktop OSes :)

For more discussion on how real locking protocols work, you might be interested in the paper "Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux"