cassandra-user mailing list archives

From Maxim Potekhin
Subject Mass deletion -- slowing down
Date Fri, 11 Nov 2011 01:30:32 GMT

My data load comes in batches, each representing one day in the life of
a large computing facility. I index the data by the day it was produced,
to be able to quickly pull data for a specific day within the last year
or two. There are 6 other indexes.

When it comes to retiring the data, I intend to delete the data for the
oldest date and then add a fresh batch, so that I control the disk
space. Therein lies a problem, and it may be Pycassa-related, so I also
filed an issue on GitHub. When I select by 'DATE=blah' and then do a
batch remove, it works fine for a while. Then, after a few thousand
deletions (done in batches of 1000), it grinds to a halt: I can no
longer iterate over the result, which manifests as a timeout error.
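
For reference, the select-then-delete loop described above can be
sketched roughly as follows. This is a minimal, library-free
illustration and not the actual Pycassa code: the in-memory `store`
dict, the date value "20101110", and the helper names `chunked`,
`delete_in_batches` and `remove_batch` are all assumptions made up for
the example. It also collects the matching keys before deleting, which
is a simplification of iterating over the query result while removing.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def delete_in_batches(keys, remove_batch, batch_size=1000):
    """Call remove_batch(list_of_keys) once per chunk.

    Returns the total number of keys handed to remove_batch.
    """
    total = 0
    for chunk in chunked(keys, batch_size):
        remove_batch(chunk)
        total += len(chunk)
    return total


# Hypothetical stand-in for a column family: row key -> columns.
store = {"row%d" % i: {"DATE": "20101110"} for i in range(2500)}

# Stand-in for the indexed select by DATE (keys are collected up front).
matching = [key for key, cols in store.items() if cols["DATE"] == "20101110"]

batch_sizes = []


def remove_batch(chunk):
    # In a real client this is where one batch mutation would be sent.
    batch_sizes.append(len(chunk))
    for key in chunk:
        del store[key]


removed = delete_in_batches(matching, remove_batch)
print(removed, batch_sizes)  # 2500 [1000, 1000, 500]
```

With a real client, `remove_batch` would wrap whatever batch-removal
call the driver provides; the chunking logic stays the same.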

Has this behavior been seen before? The Cassandra version is 0.8.6, Pycassa 1.3.0.


