incubator-cassandra-user mailing list archives

From Stanley Xu <wenhao...@gmail.com>
Subject How to configure Cassandra params to handle heavy write, light read, short TTL scenario
Date Thu, 18 Apr 2013 12:22:52 GMT
Dear buddies,

We are using Cassandra to handle the following scenario:

1. A column family that uses a Long as the key and holds one and only one
Integer column per row, with a TTL of 2 hours.
2. The wps (writes per second) is 45000; the qps (reads per second) would be
about 30 - 200.
3. There isn't a "hot zone" for reads (each query asks for a different
key), but most of the reads will hit data written in the last 30
minutes.
4. All writes are new keys with new values; there are no overwrites.
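
To put rough numbers on the workload above, here is a small back-of-the-envelope sketch. The helper names are made up for illustration, and the 12 bytes per row (an 8-byte Long key plus a 4-byte Integer value) is only the raw payload; it ignores Cassandra's per-row and per-column overhead, which is an assumption that understates real memory use.

```python
WRITES_PER_SECOND = 45_000      # from the scenario above
HOT_WINDOW_SECONDS = 30 * 60    # most reads hit the last 30 minutes
TTL_SECONDS = 2 * 60 * 60       # 2-hour TTL
RAW_BYTES_PER_ROW = 8 + 4       # Long key + Integer value, payload only (assumption)


def rows_written(writes_per_second, seconds):
    """Number of distinct rows written in a window (no overwrites)."""
    return writes_per_second * seconds


def raw_payload_bytes(rows, bytes_per_row=RAW_BYTES_PER_ROW):
    """Lower bound on data size: payload only, no storage overhead."""
    return rows * bytes_per_row


hot_rows = rows_written(WRITES_PER_SECOND, HOT_WINDOW_SECONDS)
live_rows = rows_written(WRITES_PER_SECOND, TTL_SECONDS)

print("rows in the 30-minute read window:", hot_rows)
print("rows alive within the 2-hour TTL: ", live_rows)
print("raw payload of the read window (bytes):", raw_payload_bytes(hot_rows))
```

The point of the estimate is only to show the order of magnitude of data that would have to stay in memory for reads of the last 30 minutes to avoid disk.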


We were using Cassandra for this with a read load of 40 QPS before, but once
the read QPS increased, the IO_WAIT of the system increased heavily
and we got a lot of query timeouts (we set 10ms as the timeout).

As I understand it, the main reason is that most of the queries hit the
disk with our configuration.

I am wondering whether the following would help us handle the load.

1. Increase the size of the mem_table, so that most reads are served from
the mem_table. Since that data hasn't been flushed to disk yet, the lookup
against the SSTables will be rejected by the bloom filter, so no disk seek
will happen.

But our major concern is that once a large mem_table is flushed to
disk, the new incoming queries will all go to disk and the timeouts
will still happen.

Is it possible to configure Cassandra so that there is something like a
queue of mem_tables in memory? For example, there would be 4 mem_tables in
memory (mem1, mem2, mem3, mem4, ordered by time); Cassandra would flush
mem1, and once a mem5 is full, it would flush mem2.
Is that possible?
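
To make the idea concrete, here is a toy model of the behaviour I have in mind. It is not Cassandra code and does not correspond to any existing configuration option; the class name, the parameters, and the rule "flush the oldest table when a new one is needed and the limit is reached" are all assumptions for illustration.

```python
from collections import deque


class MemtableQueue:
    """Toy model of a time-ordered queue of in-memory tables (hypothetical)."""

    def __init__(self, max_tables=4, rows_per_table=3):
        self.max_tables = max_tables
        self.rows_per_table = rows_per_table
        self.tables = deque([{}])  # oldest on the left, active table on the right
        self.flushed = []          # stands in for SSTables on disk

    def write(self, key, value):
        if len(self.tables[-1]) >= self.rows_per_table:
            # The active table is full: open a new one, and flush the
            # oldest table first if the in-memory limit is already reached.
            if len(self.tables) == self.max_tables:
                self.flushed.append(self.tables.popleft())
            self.tables.append({})
        self.tables[-1][key] = value

    def read(self, key):
        """Return (value, source), where source is 'memory', 'disk' or 'miss'."""
        for table in reversed(self.tables):   # newest in-memory data first
            if key in table:
                return table[key], "memory"
        for table in reversed(self.flushed):  # only then fall back to "disk"
            if key in table:
                return table[key], "disk"
        return None, "miss"


queue = MemtableQueue(max_tables=2, rows_per_table=2)
for k in range(6):
    queue.write(k, k * 10)

print(queue.read(5))  # recent key, still in memory
print(queue.read(0))  # old key, already flushed
```

With such a scheme, recently written keys would still be served from memory right after a flush, because only the oldest table leaves memory.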


Best wishes,
Stanley Xu