cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-4402) Atomicity violation bugs because of misusing concurrent collections
Date Mon, 02 Jul 2012 23:04:57 GMT


Jonathan Ellis updated CASSANDRA-4402:

    Attachment: 4402-v2.txt

Thanks for the patch, Yu.  The getExpireTimeForEndpoint case is the most serious, since
contains calls can fairly easily run concurrently with remove calls.  maybeInitializeLocalState
shouldn't be called concurrently, but it doesn't hurt to clean that up too.
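
The contains/remove race can be sketched as follows. This is a minimal illustration, not the
code in 4402-v2.txt: the class ExpireTimes, the map name expireTimes, the String key type, and
the fallback value are all assumptions made for the example. The idea is to replace
containsKey followed by get with a single get and a null check, so that a concurrent remove
cannot slip in between the two calls.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for an expire-time lookup; all names are illustrative only.
final class ExpireTimes
{
    private final ConcurrentMap<String, Long> expireTimes = new ConcurrentHashMap<String, Long>();

    void put(String endpoint, long expireTime)
    {
        expireTimes.put(endpoint, expireTime);
    }

    void remove(String endpoint)
    {
        expireTimes.remove(endpoint);
    }

    // Racy: a remove() between containsKey() and get() makes get() return null,
    // and unboxing that null to long throws NullPointerException.
    long racyGet(String endpoint, long fallback)
    {
        if (expireTimes.containsKey(endpoint))
            return expireTimes.get(endpoint);
        return fallback;
    }

    // Safe: one atomic read of the map, then a decision on the local copy.
    long safeGet(String endpoint, long fallback)
    {
        Long stored = expireTimes.get(endpoint);
        return stored == null ? fallback : stored;
    }

    public static void main(String[] args)
    {
        ExpireTimes times = new ExpireTimes();
        times.put("10.0.0.1", 1234L);
        System.out.println(times.safeGet("10.0.0.1", -1L));
        times.remove("10.0.0.1");
        System.out.println(times.safeGet("10.0.0.1", -1L));
    }
}
```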

I'm not sure what to do about Table.initCf, though.  It seems like CASSANDRA-2963 has made
that inherently unsafe.  A "mechanical" fix like the one here won't help since if two CFS
objects are created for the same CF, they will conflict on the mbean definition as well. 
I'll follow up on that over on the 2963 ticket.
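
To illustrate why a purely mechanical putIfAbsent does not cover this case, here is a minimal
sketch. Store is a hypothetical stand-in for an object whose constructor has an external side
effect (a counter stands in for something like an mbean registration); none of these names
come from Cassandra. putIfAbsent keeps the map consistent, but the caller that loses the race
has already constructed its object, so the side effect has happened twice.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class EagerConstructionDemo
{
    // Counts constructor side effects; stands in for an external registration.
    static final AtomicInteger registrations = new AtomicInteger();

    // Hypothetical object whose construction is not side-effect free.
    static final class Store
    {
        final String name;

        Store(String name)
        {
            this.name = name;
            registrations.incrementAndGet(); // runs on every construction
        }
    }

    static final ConcurrentMap<Integer, Store> stores = new ConcurrentHashMap<Integer, Store>();

    // Map-level atomicity only: the Store is constructed before putIfAbsent runs.
    static Store init(Integer id, String name)
    {
        Store created = new Store(name);
        Store existing = stores.putIfAbsent(id, created);
        return existing == null ? created : existing;
    }

    public static void main(String[] args)
    {
        Store first = init(1, "cf");
        Store second = init(1, "cf"); // simulates the caller that lost the race
        System.out.println(first == second);     // both callers see the winning instance
        System.out.println(registrations.get()); // but the side effect ran once per call
    }
}
```

Avoiding the duplicate side effect would require making creation itself exclusive for a given
key, not just the map update.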

In the meantime, v2 is attached with some cleanup.

> Atomicity violation bugs because of misusing concurrent collections
> -------------------------------------------------------------------
>                 Key: CASSANDRA-4402
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Yu Lin
>              Labels: patch
>         Attachments: 4402-v2.txt, cassandra-1.1.1-4402.txt
>   Original Estimate: 504h
>  Remaining Estimate: 504h
> My name is Yu Lin. I'm a Ph.D. student in the CS department at
> UIUC. I'm currently doing research on mining misuses of Java
> concurrent libraries. I found some misuses of ConcurrentHashMap in
> Cassandra 1.1.1, which may result in atomicity violation bugs or harm
> performance.
> The code below is a snapshot of the code in file
> src/java/org/apache/cassandra/db/ from line 348 to 369
> L348        if (columnFamilyStores.containsKey(cfId))
> L349        {
> L350            // this is the case when you reset local schema
> L351            // just reload metadata
> L352            ColumnFamilyStore cfs = columnFamilyStores.get(cfId);
> L353            assert cfs.getColumnFamilyName().equals(cfName);
>             ...
> L364        }
> L365        else
> L366        {
> L367            columnFamilyStores.put(cfId, ColumnFamilyStore.createColumnFamilyStore(this,
> L368        }
> In the code above, an atomicity violation may occur between lines 348
> and 352. Suppose thread T1 executes line 348 and finds that the
> concurrent hashmap "columnFamilyStores" contains the key
> "cfId". Before thread T1 executes line 352, another thread T2 removes
> the "cfId" key from "columnFamilyStores". Now thread T1 resumes
> execution at line 352 and will get a null value for "cfs". Then the
> next line will throw a NullPointerException when invoking the method
> on "cfs".
> Second, the snapshot above has another atomicity violation. Let's look
> at lines 348 and 367. Suppose a thread T1 executes line 348 and finds
> out the concurrent hashmap does not contain the key "cfId". Before it
> gets to execute line 367, another thread T2 puts a pair <cfId, v> in
> the concurrent hashmap "columnFamilyStores". Now thread T1 resumes
> execution and it will overwrite the value written by thread T2. Thus,
> the code no longer preserves the "put-if-absent" semantics.
> I found some similar misuses in other files:
> In src/java/org/apache/cassandra/gms/, a similar atomicity
> violation may occur if thread T2 puts a value into the map
> "endpointStateMap" between lines <1094 and 1099>, <1173 and 1178>. Another
> atomicity violation may occur if thread T2 removes the value on key
> "endpoint" between lines <681 and 683>.
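
The check-then-put violation described in the quoted report can be sketched as follows. This
is an illustration under assumptions, not the attached patch: Registry, the Integer and String
types, and the method names are made up for the example. ConcurrentMap.putIfAbsent performs
the check and the insert as one atomic step and returns the previous value, or null if the
key was absent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry; stands in for any map guarded by a containsKey check.
final class Registry
{
    private final ConcurrentMap<Integer, String> entries = new ConcurrentHashMap<Integer, String>();

    // Racy check-then-act: another thread may put between the check and the put,
    // and this put then overwrites that thread's value.
    String racyRegister(Integer id, String value)
    {
        if (!entries.containsKey(id))
            entries.put(id, value);
        return entries.get(id);
    }

    // Atomic: returns the value that actually ended up in the map.
    String register(Integer id, String value)
    {
        String previous = entries.putIfAbsent(id, value);
        return previous == null ? value : previous;
    }

    public static void main(String[] args)
    {
        Registry registry = new Registry();
        // Sequential stand-in for the interleaving: T2's value arrives first.
        System.out.println(registry.register(1, "from-T2")); // prints "from-T2"
        // T1's later attempt does not overwrite it.
        System.out.println(registry.register(1, "from-T1")); // prints "from-T2"
    }
}
```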
