cassandra-commits mailing list archives

From "Oleg Kibirev (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4681) SlabAllocator spends a lot of time in Thread.yield
Date Wed, 19 Sep 2012 15:02:07 GMT


Oleg Kibirev commented on CASSANDRA-4681:

I have also prototyped the newly attached version, which keeps a list of extra slabs in the
event of a race condition. However, its throughput and GC performance are inferior to the
simpler fix, at least in my benchmark.

I don't think allocating separate blocks only in the event of a race condition "defeats the
purpose" of SlabAllocator. Either the race condition is rare and its impact on GC is minimal
(after all, many other objects of different sizes, including an instance of ByteBuffer per
call, are allocated in addition to slabs), or the node is so congested that the race condition
is the default case. In the latter case no solution, and especially not
"while (!done) Thread.yield()", is likely to produce good results.

It is of course entirely possible that the .list version performs better in a different benchmark.
> SlabAllocator spends a lot of time in Thread.yield
> --------------------------------------------------
>                 Key: CASSANDRA-4681
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.5
>         Environment: OEL Linux
>            Reporter: Oleg Kibirev
>         Attachments:,
> When profiling high-volume inserts into Cassandra running on a host with a fast SSD and
> CPU, Thread.yield() invoked by SlabAllocator appeared as the top item in CPU samples. The
> fix is to return a regular byte buffer if the current slab is being initialized by another
> thread. So instead of:
>     if (oldOffset == UNINITIALIZED)
>     {
>         // The region doesn't have its data allocated yet.
>         // Since we found this in currentRegion, we know that whoever
>         // CAS-ed it there is allocating it right now. So spin-loop
>         // shouldn't spin long!
>         Thread.yield();
>         continue;
>     }
> do:
>     if (oldOffset == UNINITIALIZED)
>         return ByteBuffer.allocate(size);
> I achieved a 4x speedup in my (admittedly specialized) benchmark by using the attached
> optimized version of SlabAllocator. Since this code is in the critical path, even executing
> excessive atomic instructions or allocating unneeded extra ByteBuffer instances has a
> measurable effect on performance.
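
For readers who want to see the proposed fallback in context, here is a minimal,
self-contained sketch. It is not the attached patch and not Cassandra's actual SlabAllocator:
the class name FallbackSlabAllocator, the 1 MB REGION_SIZE, and the structure of Region are
assumptions made for illustration. It only shows the idea described above: a thread that
finds the current region still UNINITIALIZED returns a plain heap ByteBuffer instead of
spinning on Thread.yield().

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified allocator; names and sizes are illustrative only.
public class FallbackSlabAllocator {
    static final int REGION_SIZE = 1024 * 1024; // assumed slab size
    static final int UNINITIALIZED = -1;

    static final class Region {
        volatile ByteBuffer data;
        final AtomicInteger nextFreeOffset = new AtomicInteger(UNINITIALIZED);

        // Called only by the thread that installed this region.
        void init() {
            data = ByteBuffer.allocate(REGION_SIZE);
            nextFreeOffset.set(0); // publishes the region as ready
        }

        // Returns a slice of the slab, a standalone buffer if the slab is
        // still being initialized, or null if the slab is full.
        ByteBuffer allocate(int size) {
            while (true) {
                int oldOffset = nextFreeOffset.get();
                if (oldOffset == UNINITIALIZED) {
                    // Proposed fix: do not spin, fall back to a regular buffer.
                    return ByteBuffer.allocate(size);
                }
                if (oldOffset + size > REGION_SIZE) {
                    return null; // full; caller retires this region
                }
                if (nextFreeOffset.compareAndSet(oldOffset, oldOffset + size)) {
                    ByteBuffer view = data.duplicate();
                    view.position(oldOffset);
                    view.limit(oldOffset + size);
                    return view.slice();
                }
                // Lost the CAS to another allocating thread; retry.
            }
        }
    }

    private final AtomicReference<Region> currentRegion = new AtomicReference<>();

    public ByteBuffer allocate(int size) {
        if (size > REGION_SIZE) {
            return ByteBuffer.allocate(size); // too large for any slab
        }
        while (true) {
            Region region = currentRegion.get();
            if (region == null) {
                Region fresh = new Region();
                if (!currentRegion.compareAndSet(null, fresh)) {
                    continue; // another thread installed a region first
                }
                fresh.init(); // other threads may see UNINITIALIZED until this returns
                region = fresh;
            }
            ByteBuffer buf = region.allocate(size);
            if (buf != null) {
                return buf;
            }
            currentRegion.compareAndSet(region, null); // retire the full region
        }
    }
}
```

The trade-off is the one discussed in the comment above: the fallback buffer lives outside any
slab, so it is only worthwhile if the initialization race is rare.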
