incubator-cassandra-user mailing list archives

From Pushpalanka Jayawardhana <pushpalankaj...@gmail.com>
Subject How to Optimize Cassandra Updates (Use of memtables)
Date Tue, 24 Jul 2012 16:01:11 GMT
Hi all,

I am dealing with a scenario where I receive a .csv file every 10 minutes,
averaging about 300MB each. After some processing, I need to update a
Cassandra cluster according to the data in each .csv file.

The current approach keeps a HashMap in memory and updates it from the
processed .csv files, gathering the data to be written (mostly counter
updates). Then, periodically (say, at 2s intervals), the values in the
HashMap are read one by one and written to Cassandra.
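The aggregate-then-flush pattern above can be sketched roughly as follows (a minimal illustration in Python; `apply_to_cassandra` is a hypothetical stand-in for whatever driver call performs the actual counter update, not a real Cassandra API):

```python
from collections import defaultdict
from threading import Lock

class CounterBuffer:
    """Accumulates counter deltas in memory and drains them periodically."""

    def __init__(self):
        self._lock = Lock()
        self._deltas = defaultdict(int)

    def add(self, key, delta=1):
        # Called once per processed .csv row; cheap in-memory update.
        with self._lock:
            self._deltas[key] += delta

    def drain(self):
        # Called on the periodic timer (e.g. every 2s): swap out the
        # buffer and return its accumulated contents.
        with self._lock:
            batch, self._deltas = self._deltas, defaultdict(int)
        return batch

def flush(buffer, apply_to_cassandra):
    # apply_to_cassandra(key, delta) is a hypothetical callback standing
    # in for a driver-issued counter update such as
    # "UPDATE counts SET c = c + ? WHERE k = ?".
    for key, delta in buffer.drain().items():
        apply_to_cassandra(key, delta)
```

One upside of draining the whole buffer at once is that many row-level increments collapse into a single delta per key before anything touches the network.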

I have tried generating SSTables and loading the data in batches via
sstableloader, but that is a lot slower than my requirement of near
real-time results.

Are there any hints on what I could try? For instance, is there any way to
update values directly in something like a memtable (instead of my HashMap)
and send those to Cassandra, rather than loading via SSTables?



-- 
Pushpalanka Jayawardhana
