cassandra-commits mailing list archives

From "Sylvain Lebresne (Commented) (JIRA)"
Subject [jira] [Commented] (CASSANDRA-3748) Range ghosts don't disappear as expected and accumulate
Date Thu, 09 Feb 2012 07:31:00 GMT


Sylvain Lebresne commented on CASSANDRA-3748:

One option would be to look at the sstables themselves, using say sstable2json, but that'll
probably not be very user friendly. Another option would be to add some debug info and recompile.
> Range ghosts don't disappear as expected and accumulate
> -------------------------------------------------------
>                 Key: CASSANDRA-3748
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.3
>         Environment: Cassandra on Debian 
>            Reporter: Dominic Williams
>              Labels: compaction, ghost-row, range, remove
>             Fix For: 1.0.8
>   Original Estimate: 6h
>  Remaining Estimate: 6h
> I have a problem where range ghosts are accumulating and cannot be removed by reducing
> GCGraceSeconds and compacting.
> In our system, we have some CFs that represent "markets", where each row represents an
> item. Once an item is sold, it is removed from the market by passing its key to remove().
> The problem, which was hidden for some time by caching, is appearing on read. Every few
> seconds our system collates a random sample from each CF/market by choosing a random starting
> key:
> String startKey = RNG.nextUUID();
> and then loading a page range of rows, specifying the key range as:
> KeyRange keyRange = new KeyRange(pageSize);
> keyRange.setStart_key(startKey);
> keyRange.setEnd_key(maxKey);
> The returned rows are iterated over, and ghosts are ignored. If insufficient rows are obtained,
> the process is repeated using the key of the last row as the starting key (or wrapping if
> necessary, etc.).
> When performance was lagging, we did a test and found that constructing a random sample
> of 40 items (rows) involved iterating over hundreds of thousands of ghost rows.
> Our first attempt to deal with this was to halve our GCGraceSeconds and then perform
> major compactions. However, this had no effect on the number of ghost rows being returned.
> Furthermore, on examination it seems clear that the number of ghost rows being created within
> the GCGraceSeconds window must be smaller than the number being returned. This looks like a bug.
> We are using Cassandra 1.0.3 with Sylvain's patch from CASSANDRA-3510.
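
The paging loop described in the report can be sketched roughly as follows. This is only an illustration and not the reporter's actual code: it replaces the Thrift range query with an in-memory sorted map, orders rows by key rather than by partitioner token, and omits the wrap-around step. The names GhostSamplingSketch, fetchPage, sample, and Result are hypothetical, and a row with zero columns stands in for a range ghost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: rows map a key to a column count; 0 columns models a range ghost.
final class GhostSamplingSketch {

    /** Outcome of one sampling pass: live keys found and ghosts iterated over. */
    static final class Result {
        final List<String> liveKeys = new ArrayList<>();
        int ghostsSkipped = 0;
    }

    // Stand-in for a range query: up to pageSize rows with key >= startKey, in key order.
    static List<Map.Entry<String, Integer>> fetchPage(
            NavigableMap<String, Integer> rows, String startKey, int pageSize) {
        List<Map.Entry<String, Integer>> page = new ArrayList<>();
        for (Map.Entry<String, Integer> e : rows.tailMap(startKey, true).entrySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(e);
        }
        return page;
    }

    // Collects up to `wanted` live rows starting at startKey, skipping ghosts.
    static Result sample(NavigableMap<String, Integer> rows, String startKey,
                         int wanted, int pageSize) {
        if (pageSize < 2) {
            // Each later page repeats the previous last row, so it needs room for a new one.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        Result result = new Result();
        String cursor = startKey;
        boolean skipFirst = false; // later pages begin with the row already seen
        while (result.liveKeys.size() < wanted) {
            List<Map.Entry<String, Integer>> page = fetchPage(rows, cursor, pageSize);
            int from = skipFirst ? 1 : 0;
            if (page.size() <= from) {
                break; // end of range reached; wrapping is omitted in this sketch
            }
            for (int i = from; i < page.size() && result.liveKeys.size() < wanted; i++) {
                Map.Entry<String, Integer> row = page.get(i);
                if (row.getValue() == 0) {
                    result.ghostsSkipped++; // ghost: key returned with no columns
                } else {
                    result.liveKeys.add(row.getKey());
                }
            }
            cursor = page.get(page.size() - 1).getKey(); // restart from the last row's key
            skipFirst = true;
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> rows = new TreeMap<>();
        rows.put("a", 0);
        rows.put("b", 0);
        rows.put("c", 1);
        rows.put("d", 0);
        rows.put("e", 2);
        rows.put("f", 0);
        Result r = sample(rows, "a", 2, 3);
        System.out.println("live=" + r.liveKeys + " ghostsSkipped=" + r.ghostsSkipped);
    }
}
```

The ghostsSkipped counter corresponds to the figure the report complains about: when ghosts vastly outnumber live rows, the loop needs many pages to gather even a small sample.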
