incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: commitlog replay missing data
Date Wed, 13 Jul 2011 20:10:44 GMT
Have you verified that data you expect to see is not in the server after shutdown?

WRT the difference between the Memtable data size and the SSTable live size, don't
believe everything you read :)

Memtable live size is increased by the serialised byte size of every column inserted, and
is never decremented. Deletes and overwrites will inflate this value. What was your workload
like?
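
To illustrate why the two numbers can diverge, here is a toy sketch of a counter that only ever grows. It is not Cassandra's code: the class `NaiveMemtable`, its methods, and the way a tombstone is sized are all made up for illustration, assuming only what is described above (every write adds its serialised size, nothing is ever subtracted).

```python
class NaiveMemtable:
    """Toy model (hypothetical, not Cassandra's implementation)."""

    def __init__(self):
        self.columns = {}        # (row_key, column_name) -> latest value bytes
        self.reported_size = 0   # grows on every write, never decremented

    def insert(self, row_key, column_name, value):
        # An overwrite replaces the old value, but the counter still grows.
        self.columns[(row_key, column_name)] = value
        self.reported_size += len(column_name.encode()) + len(value)

    def delete(self, row_key, column_name):
        # Assumption: a delete is recorded as an empty-valued tombstone write.
        self.columns[(row_key, column_name)] = b""
        self.reported_size += len(column_name.encode())

    def live_size(self):
        # Bytes that would actually be written out: one entry per column.
        return sum(len(name.encode()) + len(value)
                   for (_, name), value in self.columns.items())


if __name__ == "__main__":
    mt = NaiveMemtable()
    for _ in range(10):
        mt.insert("row1", "c", b"x" * 100)   # same column overwritten 10 times
    print(mt.reported_size, mt.live_size())  # 1010 101
```

With an overwrite-heavy workload the reported size can be many times what ends up on disk, which is why the workload matters here.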

As of 0.8 we have global memory management for CFs, which tracks the actual JVM bytes used
by each CF.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12/07/2011, at 3:28 PM, Jeffrey Wang <jwang@palantir.com> wrote:

> Hey all,
> 
>  
> 
> Recently upgraded to 0.8.1 and noticed what seems to be missing data after a commitlog
> replay on a single-node cluster. I start the node, insert a bunch of stuff (~600MB), stop
> it, and restart it. There are log messages pertaining to the commitlog replay and no errors,
> but some of the data is missing. If I flush before stopping the node, everything is fine,
> and running cfstats in the two cases shows different amounts of data in the SSTables. Moreover,
> the amount of data that is missing is nondeterministic. Has anyone run into this? Thanks.
> 
>  
> 
> Here is the output of a side-by-side diff between cfstats outputs for a single CF before
> restarting (left) and after (right). Somehow a 37MB memtable became a 2.9MB SSTable (note
> the difference in write count as well)?
> 
>  
> 
> Column Family: Blocks                Column Family: Blocks
> SSTable count: 0                  |  SSTable count: 1
> Space used (live): 0              |  Space used (live): 2907637
> Space used (total): 0             |  Space used (total): 2907637
> Memtable Columns Count: 8198      |  Memtable Columns Count: 0
> Memtable Data Size: 37550510      |  Memtable Data Size: 0
> Memtable Switch Count: 0          |  Memtable Switch Count: 1
> Read Count: 0                        Read Count: 0
> Read Latency: NaN ms.                Read Latency: NaN ms.
> Write Count: 8198                 |  Write Count: 1526
> Write Latency: 0.018 ms.          |  Write Latency: 0.011 ms.
> Pending Tasks: 0                     Pending Tasks: 0
> Key cache capacity: 200000           Key cache capacity: 200000
> Key cache size: 0                    Key cache size: 0
> Key cache hit rate: NaN              Key cache hit rate: NaN
> Row cache: disabled                  Row cache: disabled
> Compacted row minimum size: 0     |  Compacted row minimum size: 1110
> Compacted row maximum size: 0     |  Compacted row maximum size: 2299
> Compacted row mean size: 0        |  Compacted row mean size: 1960
>  
> 
> Note that I patched https://issues.apache.org/jira/browse/CASSANDRA-2317 in my version,
> but there are no deletions involved so I don’t think it’s relevant unless I messed something
> up while patching.
> 
>  
> 
> -Jeffrey
> 
