Dissecting the Disruptor: Demystifying Memory Barriers

The RingBuffer cursor is one of these magic volatile thingies, and it’s one of the reasons we can get away with implementing the Disruptor without locking.

The Producer will obtain the next Entry (or batch of them) and do whatever it needs to do to the entries, updating them with whatever values it wants to place in there. As you know, at the end of all the changes the producer calls the commit method on the ring buffer, which updates the sequence number. This write of the volatile field (cursor) creates a memory barrier which ultimately brings all the caches up to date (or at least invalidates them accordingly).

At this point, the consumers can read the updated sequence number (8), and because the memory barrier also guarantees the ordering of the instructions that happened before it, the consumers can be confident that all the changes the producer made to the Entry at position 7 are also available.
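
To make that concrete, here's a minimal sketch of the pattern - not the Disruptor's actual code, the class and field names are invented for illustration. The producer does plain writes to the entry and then a single volatile write to the cursor; any consumer that subsequently reads the cursor is guaranteed to see those entry writes too.

// A minimal sketch, not the real Disruptor code - class and field names are invented.
class CursorPublishSketch
{
    private static final int SIZE = 8;
    private final long[] entries = new long[SIZE]; // stands in for the ring of Entries
    private volatile long cursor = -1;             // the "magic" volatile sequence number

    // Producer: plain writes to the entry, then the volatile write that commits it
    void publish(long sequence, long value)
    {
        entries[(int) (sequence % SIZE)] = value;  // ordinary write
        cursor = sequence;                         // volatile write - the memory barrier
    }

    // Consumer: volatile read of the cursor first, then the entry.
    // Once it sees cursor >= sequence, it is guaranteed to see the value written above.
    long read(long sequence)
    {
        while (cursor < sequence)
        {
            Thread.yield(); // spin until the producer has committed this sequence
        }
        return entries[(int) (sequence % SIZE)];
    }
}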

…and on the Consumer side?

The sequence number on the Consumer is volatile, and read by a number of external objects - other downstream consumers might be tracking this consumer, and the ProducerBarrier/RingBuffer (depending on whether you’re looking at older or newer code) tracks it to make sure the ring doesn’t wrap.
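
To sketch why the wrap check works (a simplified illustration with invented names, not the actual ProducerBarrier code): the producer reads each gating consumer's volatile sequence and only claims a slot once the slowest consumer has moved past the entry that claim would overwrite.

// A simplified sketch of the wrap check, with invented names - not the real code.
interface TrackedConsumer
{
    long getSequence(); // backed by the consumer's volatile sequence field
}

final class WrapCheckSketch
{
    private final int bufferSize;
    private final TrackedConsumer[] gatingConsumers;

    WrapCheckSketch(int bufferSize, TrackedConsumer[] gatingConsumers)
    {
        this.bufferSize = bufferSize;
        this.gatingConsumers = gatingConsumers;
    }

    // Block until claiming nextSequence can no longer overwrite an entry
    // that some consumer hasn't processed yet.
    long claim(long nextSequence)
    {
        final long wrapPoint = nextSequence - bufferSize;
        while (wrapPoint > minimumConsumerSequence())
        {
            Thread.yield(); // wait for the slowest consumer to catch up
        }
        return nextSequence;
    }

    private long minimumConsumerSequence()
    {
        long min = Long.MAX_VALUE;
        for (TrackedConsumer consumer : gatingConsumers)
        {
            min = Math.min(min, consumer.getSequence()); // volatile read per consumer
        }
        return min;
    }
}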

So, if your downstream consumer (C2) sees that an earlier consumer (C1) has reached number 12, when C2 reads entries up to 12 from the ring buffer it will see all the updates C1 made to those entries before it updated its sequence number.

Basically everything that happens after C2 gets the updated sequence number (shown in blue above) must occur after everything C1 did to the ring buffer before updating its sequence number (shown in black).
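
It's the same volatile rule as on the producer side, just applied between consumers. A compact sketch, again with invented names rather than the real code:

// Hedged sketch: C1 publishes its progress with a volatile write, and C2 gates on
// that sequence, so C2 also sees everything C1 wrote to the entries before it.
final class ConsumerChainSketch
{
    private static final int SIZE = 16;
    private final String[] entries = new String[SIZE];
    private volatile long c1Sequence = -1; // C1's progress, read by C2 (and the producer side)

    // C1 enriches the entry, then updates its sequence
    void c1Process(long sequence)
    {
        entries[(int) (sequence % SIZE)] = "enriched by C1"; // plain write
        c1Sequence = sequence;                               // volatile write
    }

    // C2 only reads entries C1 has already published
    String c2Process(long sequence)
    {
        while (c1Sequence < sequence)
        {
            Thread.yield(); // wait for C1 to get there
        }
        return entries[(int) (sequence % SIZE)]; // guaranteed to see C1's change
    }
}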

Impact on performance

Memory barriers, being CPU-level instructions, don't have the same cost as locks - the kernel isn't interfering and arbitrating between multiple threads. But nothing comes for free. Memory barriers do have a cost - the compiler and CPU cannot re-order instructions across them, which can stop the CPU being used as efficiently as it might be, and refreshing the caches obviously has a performance impact. So don't think that using volatile instead of locking will get you away scot free.

You’ll notice that the Disruptor implementation tries to read from and write to the sequence number as infrequently as possible. Every read or write of a volatile field is a relatively costly operation. However, recognising this also plays in quite nicely with batching behaviour - if you know you shouldn’t read from or write to the sequences too frequently, it makes sense to grab a whole batch of Entries and process them before updating the sequence number, both on the Producer and Consumer side. Here’s an example from BatchConsumer:

long nextSequence = sequence + 1;
while (running)
{
    try 
    {
        final long availableSequence = consumerBarrier.waitFor(nextSequence);
        while (nextSequence <= availableSequence)
        {
            entry = consumerBarrier.getEntry(nextSequence);
            handler.onAvailable(entry);
            nextSequence++;
        }
        handler.onEndOfBatch();
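        // a single write to the volatile sequence field, once per batch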
        sequence = entry.getSequence();
    }
    // ... (other catch clauses elided)
    catch (final Exception ex)
    {
        exceptionHandler.handle(ex, entry);
        sequence = entry.getSequence();
        nextSequence = entry.getSequence() + 1;
    }
}

(You’ll note this is the “old” code and naming conventions; since this is in line with my previous blog posts, I thought it was slightly less confusing than switching straight to the new conventions.)

In the code above, we increment a local variable (nextSequence) as we loop over the entries the consumer is processing. This means we read from and write to the volatile sequence field as infrequently as we can get away with.

In Summary

Memory barriers are CPU instructions that allow you to make certain guarantees about when data will be visible to other threads. In Java, you get them with the volatile keyword. Using volatile means you don’t necessarily have to add locks willy nilly, and it will give you performance improvements over using them. However, you need to think a little more carefully about your design - in particular how frequently you use volatile fields, and how frequently you read and write them.

PS Given that the New World Order in the Disruptor uses totally different naming conventions now to everything I’ve blogged about so far, I guess the next post is mapping the old world to the new one.
