incubator-cassandra-user mailing list archives
From Aaron Morton <aa...@thelastpickle.com>
Subject get_slice and deletes
Date Tue, 14 Sep 2010 02:35:40 GMT
I'm running the 0.7 nightly build from Aug 31 and noticed different performance characteristics
when using get_slice against a row that has seen a lot of deletes.

One row in the keyspace has around 650K columns. The columns are small, around 53 bytes each,
for a total of roughly 30MB. In the last hour or so I finished deleting around 300K columns
from the row (and roughly another 1M rows from other CFs); the deleted columns were ordered
before those left in the row.

I stopped my processing, restarted it, and noticed that get_slice was running significantly slower
than before. If I do a get_slice for 101 columns with no finish column name and vary the start
column, I see different performance:

start = "" - 5 to 6 secs
start = "excer" - 5 to 6 secs
start = "excerise-2010-08-31t17-15-57-92421646-11330" - 0.5 to 0.6 secs (this is the first
column in the row)

For comparison, a get_slice against another row with 232K columns (same keyspace and column
size, different CF) with an empty start returned in 0.01 secs.
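The pattern would make sense if the slice has to scan past every tombstone before it can collect live columns. Here is a toy Python model (not Cassandra code; the row layout and helper names are made up for illustration) of that suspected behavior: a start of "" must walk through all 300K tombstones before finding 101 live columns, while starting at the first live column skips them entirely.

```python
import bisect

# Hypothetical row layout: sorted (name, is_tombstone) pairs, with the
# 300K deleted columns ordered before the live ones, as in the row above.
def build_row(n_deleted, n_live):
    row = [("col-%07d" % i, True) for i in range(n_deleted)]
    row += [("col-%07d" % (n_deleted + i), False) for i in range(n_live)]
    return row

def get_slice(row, start, count):
    """Return up to `count` live column names at or after `start`,
    and how many entries (live + tombstone) had to be scanned."""
    names = [name for name, _ in row]
    i = bisect.bisect_left(names, start)
    result, scanned = [], 0
    while i < len(row) and len(result) < count:
        name, is_tombstone = row[i]
        scanned += 1
        if not is_tombstone:
            result.append(name)
        i += 1
    return result, scanned

row = build_row(n_deleted=300_000, n_live=350_000)

# Slice from the beginning of the row: wades through every tombstone.
_, scanned_from_empty = get_slice(row, "", 101)
# Slice starting at the first live column: skips the tombstones.
first_live = "col-%07d" % 300_000
_, scanned_from_live = get_slice(row, first_live, 101)
print(scanned_from_empty, scanned_from_live)  # 300101 101
```

In this model the two slices return the same 101 columns, but the first scans roughly 3000x more entries, which is in the same ballpark as the 5-6 sec vs 0.5 sec difference above.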

Could a high level of deletes on a row reduce get_slice performance? Is it worth forcing
the tombstones out by reducing GCGraceSeconds and running a compaction to see what happens?
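If it helps, what I have in mind is something like the following (0.7-era syntax as I understand it; the keyspace and CF names are placeholders, and the exact cassandra-cli attribute name is an assumption on my part):

```shell
# Lower gc_grace on the column family via cassandra-cli (placeholder names):
#   update column family MyCF with gc_grace = 3600;
# Then force a major compaction with nodetool to drop the expired tombstones:
nodetool -h localhost compact MyKeyspace MyCF
```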

Thanks
Aaron



