cassandra-user mailing list archives

From Jonathan Ellis <>
Subject Re: Cassandra paging, gathering stats
Date Mon, 22 Feb 2010 20:07:56 GMT
On Mon, Feb 22, 2010 at 1:40 PM, Sonny Heer <> wrote:
> Hey,
> We are in the process of implementing a cassandra application service.
> we have already ingested TB of data using the cassandra bulk loader
> (StorageService).
> One of the requirements is to get a data explosion factor as a result of
> denormalization.  Since the writes are going to the memory tables, I'm not
> sure how I could grab stats.  I can't get size of data before ingest since
> some of the data may be duplicated.

Easiest way: write some known amount of data, then use nodeprobe flush
to force it to disk.  df can tell you how much data is used, no need
to get fancy.
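
A minimal sketch of that measurement, assuming you note the size of the
data directory before the test write and again after the flush. The
helper names and the idea of walking the directory are illustrative
only, not part of Cassandra. It sums apparent file sizes, so the result
will differ slightly from the allocated-block figure that df reports.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so nothing is counted twice.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def explosion_factor(raw_bytes, before_bytes, after_bytes):
    """On-disk growth divided by the known amount of raw input data."""
    if raw_bytes <= 0:
        raise ValueError("raw_bytes must be positive")
    return (after_bytes - before_bytes) / raw_bytes
```

Usage would be: call dir_size_bytes on the data directory, write the
known amount of data, run nodeprobe flush, call dir_size_bytes again,
and pass both readings to explosion_factor.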

2nd easiest: hack your client to record how much data it is sending over.
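
A rough sketch of the client-side option, assuming your client funnels
every write through a single function. The class name and the
insert(key, column, value) signature are hypothetical stand-ins for
whatever call your client actually makes, and the arguments are assumed
to be byte strings.

```python
class ByteCountingSender:
    """Wraps a write function and tallies the payload bytes handed to it."""

    def __init__(self, send):
        # `send` is the real write call (hypothetical signature:
        # send(key, column, value)).
        self._send = send
        self.bytes_sent = 0
        self.writes = 0

    def insert(self, key, column, value):
        result = self._send(key, column, value)
        # Count only after the underlying call returns without raising.
        self.bytes_sent += len(key) + len(column) + len(value)
        self.writes += 1
        return result
```

This counts payload bytes only, not protocol framing, so it measures
what the application sent rather than what went over the wire.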

> I was wondering if you knew of any way to do paging over all the keys for a
> given Column family.  Or perhaps how I can read from the mem table.  I tried
> the following ...  I'm getting 0 bytes each time.

You're using SSTableReader locally?

There won't be any sstables until either a memtable fills up and
flushes on its own, or you use nodeprobe flush as described above.

