If you have a long-lived row with a lot of tombstones or overwrites, it's often more efficient to select a known list of columns. There are short circuits in the read path that can avoid reading older, tombstone-filled fragments of the row. (Obviously this is hard to do if you don't know the names of the columns.)
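
To make the idea concrete, here is a rough toy model in Python. It is not Cassandra's actual read path; the fragment layout, the `read_named` and `read_all` helpers, and the sample data are all made up for illustration. It assumes each on-disk fragment of a row knows its maximum timestamp, so a read for named columns can stop as soon as every requested column is newer than anything the remaining fragments could hold.

```python
# Toy model only: NOT Cassandra's actual read path.
# Each fragment stands in for one SSTable's piece of a row:
# {column_name: (timestamp, value)}; value None marks a tombstone.

def max_timestamp(fragment):
    """Newest timestamp stored in a fragment."""
    return max(ts for ts, _ in fragment.values())

def read_named(fragments, names):
    """Read specific columns, newest fragment first, stopping early."""
    wanted = set(names)
    result = {}
    scanned = 0
    for frag in sorted(fragments, key=max_timestamp, reverse=True):
        # Short circuit: every requested column is already newer than
        # anything this (or any older) fragment could contain.
        if wanted.issubset(result) and all(
            result[n][0] > max_timestamp(frag) for n in wanted
        ):
            break
        scanned += 1
        for name in wanted.intersection(frag):
            if name not in result or frag[name][0] > result[name][0]:
                result[name] = frag[name]
    live = {n: v for n, (ts, v) in result.items() if v is not None}
    return live, scanned

def read_all(fragments):
    """Read the whole row: every fragment has to be merged."""
    result = {}
    for frag in fragments:
        for name, cell in frag.items():
            if name not in result or cell[0] > result[name][0]:
                result[name] = cell
    live = {n: v for n, (ts, v) in result.items() if v is not None}
    return live, len(fragments)

# Hypothetical row: two old fragments full of tombstones, one recent one.
fragments = [
    {"a": (1, None), "b": (1, None), "c": (1, None)},
    {"d": (2, None), "e": (2, None)},
    {"name": (10, "alice"), "email": (10, "alice@example.org")},
]

named_result, named_scanned = read_named(fragments, ["name", "email"])
all_result, all_scanned = read_all(fragments)
print(named_scanned, all_scanned)
```

In this sketch both reads return the same live columns, but the named read touches only the newest fragment while the full-row read has to merge all three, tombstones included.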


Aaron Morton
Freelance Developer

On 11/11/2012, at 10:51 PM, André Cruz <andre.cruz@co.sapo.pt> wrote:

On Nov 11, 2012, at 12:01 AM, Binh Nguyen <binhnv80@gmail.com> wrote:

FYI: Repair does not remove tombstones. To remove tombstones you need to run compaction.
If you have a lot of data, make sure you run compaction on all nodes before running repair. We had a lot of trouble with tombstones in our system, and it took us a long time to figure out the reason. It turned out that the repair process also transfers TTLed data (when compaction has not been triggered yet) to the other nodes, even if that data had already been removed from those nodes by an earlier compaction.
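
A rough sketch of why compaction is the step that purges tombstones, written as a toy Python model rather than anything taken from Cassandra's code. The `compact` helper and the sample fragments are hypothetical; the one real value is 864000 seconds (10 days), the default `gc_grace_seconds`. The model assumes all fragments of the row take part in the compaction and that timestamps are in seconds.

```python
# Toy model only: NOT Cassandra's actual compaction code.
GC_GRACE_SECONDS = 864000  # default gc_grace_seconds (10 days)

def compact(fragments, now, gc_grace=GC_GRACE_SECONDS):
    """Merge all fragments of a row into one.

    Keeps the newest cell per column and drops tombstones (value None)
    whose timestamp is at least gc_grace seconds old.
    """
    merged = {}
    for frag in fragments:
        for name, (ts, value) in frag.items():
            if name not in merged or ts > merged[name][0]:
                merged[name] = (ts, value)
    return {
        name: (ts, value)
        for name, (ts, value) in merged.items()
        if value is not None or now - ts < gc_grace
    }

# Hypothetical row: "a" was written then deleted long ago,
# "b" was deleted recently, "c" is live.
fragments = [
    {"a": (0, "v1")},
    {"a": (100, None), "b": (2_000_000, None)},
    {"c": (50, "live")},
]

compacted = compact(fragments, now=2_000_100)
print(compacted)
```

In this model the old tombstone for "a" disappears along with the data it shadowed, while the recent tombstone for "b" is kept because it is still inside the grace period. Nothing in a repair performs this merge-and-drop step, which is consistent with repair shipping data that has not been compacted away yet.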

Aren't compactions triggered automatically? At least minor compactions. Also, I read this in http://www.datastax.com/docs/1.1/operations/tuning#tuning-compaction :

"After running a major compaction, automatic minor compactions are no longer triggered, frequently requiring you to manually run major compactions on a routine basis."
"DataStax does not recommend major compaction."

So I'm unsure whether to start triggering these compactions manually… I guess I'll have to experiment with it.