> Some 'feature' for future implementation, maybe?
IMHO truncation working as a metadata operation is the correct approach. It's generally used in testing and development. It deletes the data and removes the SSTables, giving you a clean state.

A CF level tombstone would mean that reads had to examine every SSTable to see if they had columns with a higher timestamp. And the data would not be removed until gc_grace_seconds had passed.
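To put numbers on it, here is a tiny sketch (hypothetical class and timestamps, nothing like Cassandra's actual read path) of the reconciliation a CF level tombstone would force on every read:

// Hypothetical sketch, not Cassandra's actual read path: with a CF level
// tombstone, every SSTable must be consulted, because any of them may hold
// a column written after the tombstone.
public class CfTombstoneSketch {
    static final long TOMBSTONE_AT = 2000; // when the CF tombstone was written

    // A column survives the tombstone only if its timestamp is higher.
    static boolean survives(long columnTimestamp) {
        return columnTimestamp > TOMBSTONE_AT;
    }

    public static void main(String[] args) {
        long[] columnTimestampsPerSSTable = {1500, 1999, 2001};
        for (long ts : columnTimestampsPerSSTable) {
            System.out.println("column@" + ts + " survives: " + survives(ts));
        }
    }
}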

> It seems odd if you expect the same behaviour as "delete from usertable" (in SQL, not yet in CQL, I presume), especially because the truncate is synced over all nodes before it returns to the client, so a truncate may rightfully discard its handoffs, right?
It is analogous to http://en.wikipedia.org/wiki/Truncate_(SQL)

A CF level tombstone would be analogous to "delete from foo".

> I wonder, though, since you don't know YCSB: what do you use to do performance testing? Did you write your own or use another tool? If the latter, I would like to know what you use :)
If I want to stress test a normal-sized cluster I use the Java stress tool included in the source distribution.

Previously, when I had to benchmark a system for a proof of concept and capacity planning, I cobbled together a system with Python, Redis and flat files to replay real-life requests using multiple clients. That allowed us to adjust the read/write mix and the request rate.
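For what it's worth, the shape of that harness was roughly the following, shown here as a hypothetical Java sketch (the real thing was Python, Redis and flat files, and all the names below are invented): worker threads drain a queue of replayed requests, a read-ratio knob sets the mix, and a per-thread sleep paces the rate.

import java.util.Random;
import java.util.concurrent.*;

// Hypothetical sketch of the replay harness described above.
public class ReplayHarness {
    static final int THREADS = 8;
    static final double READ_RATIO = 0.8;       // adjust the read/write mix
    static final long REQUEST_INTERVAL_MS = 10; // adjust the rate per thread

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> requests = new LinkedBlockingQueue<>();
        for (int i = 0; i < 1000; i++) {
            requests.add("request-" + i); // stand-in for replayed log lines
        }

        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            pool.submit(() -> {
                Random rnd = new Random();
                try {
                    String req;
                    while ((req = requests.poll(1, TimeUnit.SECONDS)) != null) {
                        if (rnd.nextDouble() < READ_RATIO) {
                            read(req);
                        } else {
                            write(req);
                        }
                        Thread.sleep(REQUEST_INTERVAL_MS); // crude rate control
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
    }

    static void read(String req)  { /* issue a read against the system under test */ }
    static void write(String req) { /* issue a write against the system under test */ }
}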

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 7/05/2012, at 11:51 PM, Peter Dijkshoorn wrote:

Check, I understand. Thanks!

The cluster certainly was overloaded and I did not realize that truncate does not tombstone or have a timestamp. Some 'feature' for future implementation, maybe?
It seems odd if you expect the same behaviour as "delete from usertable" (in SQL, not yet in CQL, I presume), especially because the truncate is synced over all nodes before it returns to the client, so a truncate may rightfully discard its handoffs, right?
BTW, it was very hard to replicate this behaviour; it seems to be a rare occurrence...

I wonder, though, since you don't know YCSB: what do you use to do performance testing? Did you write your own or use another tool? If the latter, I would like to know what you use :)

Ciao


Peter Dijkshoorn
Adyen - Payments Made Easy
www.adyen.com

Visiting address:    	        Mail Address:             
Simon Carmiggeltstraat 6-50    	P.O. Box 10095
1011 DJ Amsterdam               1001 EB Amsterdam
The Netherlands                 The Netherlands

Office +31.20.240.1240
Email peter.dijkshoorn@adyen.com

On 05/07/2012 12:59 PM, aaron morton wrote:
I don't know the YCSB code, but one theory would be…

1) The cluster is overloaded by the test. 
2) A write at CL ALL fails because a node does not respond in time. 
3) The coordinator stores the hint and returns failure to the client. 
4) The client gets a TimedOutException and retries the operation.
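In code terms, steps 3 and 4 look roughly like this hypothetical sketch (stand-in types, not the YCSB or Thrift client):

// Hypothetical sketch: the coordinator has already stored a hint for the
// slow replica, but the client sees a failure and retries, so the same
// logical write can end up applied twice - once by the retry, once when
// the hint is later delivered.
public class RetryOnTimeout {
    static final int MAX_RETRIES = 3;

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            try {
                write("user1", "value"); // stand-in for the CL ALL insert
                break;                   // success, stop retrying
            } catch (WriteTimeoutException e) {
                // The timed-out write may still be delivered later via the
                // stored hint, so this retry can duplicate it.
                System.out.println("attempt " + attempt + " timed out, retrying");
            }
        }
    }

    static void write(String key, String value) throws WriteTimeoutException {
        /* send the write; times out if a replica is too slow */
    }

    // Stand-in for the driver's timeout error (TimedOutException in Thrift).
    static class WriteTimeoutException extends Exception {}
}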

Did the nodes show any dropped messages, either in nodetool tpstats or in the logs?

Truncate is a metadata operation, unlike deleting columns or rows. When a column is deleted, a tombstone column is written; when a row is deleted, deletion information is associated with the key in the context of the CF. Truncate snapshots and then deletes the SSTables on disk; it does not write to the SSTables. So it is possible for a write to be stored with a lower timestamp than the truncate, because truncate does not have a timestamp.
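So a plausible sequence for your test, with made-up timestamps, is:

t1: a write at CL ALL times out on one replica; the coordinator stores a hint
t2: the client retries and the write succeeds everywhere
t3: truncate runs: snapshot, then delete the SSTables (no timestamp is recorded)
t4: hinted handoff delivers the t1 write back to a replica
t5: a count sees the re-delivered data, even though it was originally written
    before the truncate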

cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 4/05/2012, at 1:28 AM, Peter Dijkshoorn wrote:

Hi guys,

I got a weird thing popping up twice today. I ran a test where I insert
a million records via YCSB, which I edited to allow me to adjust the
consistency level: the write operations are done with ConsistencyLevel.ALL.
This is sent to a 4 (virtual) node cluster with a keyspace 'test' set up
with replication factor 3.
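For reference, the edit essentially boils down to passing a configurable
ConsistencyLevel into the Thrift insert call. A minimal standalone version,
assuming the Cassandra 1.0 Thrift API on port 9160 and not taken from the
actual YCSB code:

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class WriteAtAll {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport =
            new TFramedTransport(new TSocket("127.0.0.1", 9160));
        Cassandra.Client client =
            new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("test");

        Column col = new Column(ByteBuffer.wrap("field0".getBytes()));
        col.setValue(ByteBuffer.wrap("value".getBytes()));
        col.setTimestamp(System.currentTimeMillis() * 1000); // microseconds

        client.insert(ByteBuffer.wrap("user1".getBytes()),
                      new ColumnParent("usertable"),
                      col,
                      ConsistencyLevel.ALL); // configurable instead of fixed

        transport.close();
    }
}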
Now I expect that, because of ConsistencyLevel.ALL, there is no hinted
handoff active, since writes have to be accepted by all nodes before the
operation returns to the client. The client only gets OK back; none fail.
After the test I run a truncate, and then a count, which reveals records
that are still active; time does not matter, I have to re-invoke the
truncate to remove the records.

[cqlsh 2.0.0 | Cassandra 1.0.8 | CQL spec 2.0.0 | Thrift protocol 19.20.0]
cqlsh> use test;
cqlsh:test> truncate usertable;
cqlsh:test> select count(*) from usertable ;
count
-------
    3


In the Cassandra output (running with -f) I can see that there is some
handoff activity, which I did not expect.

Does anyone have an idea why the handoff is active while issuing operations
with ConsistencyLevel.ALL?
Why is the truncate not correctly synchronised, such that subsequent handoffs
can still deliver records originally written before the truncate?

Thanks if you can clarify these things; I did not expect this at all.

Cheers,

Peter

--
Peter Dijkshoorn
Adyen - Payments Made Easy
www.adyen.com

Visiting address:            Mail Address:             
Simon Carmiggeltstraat 6-50     P.O. Box 10095
1011 DJ Amsterdam               1001 EB Amsterdam
The Netherlands                 The Netherlands

Office +31.20.240.1240
Email peter.dijkshoorn@adyen.com