That sounds sensible - thanks for the heads up.
And every row mutation in your batch becomes a task in the Mutation thread pool. If one replica gets 500 row mutations from one client request, it will take a while for the (default) 32 threads to chew through them. While this is going on, other client requests will effectively be blocked.
Depending on the number of clients, I would start with, say, 50 rows per batch and keep an eye on the *request* latency.
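For example, the client could slice its input stream into bounded batches before sending each one. A minimal Python sketch (the `send_batch` callback is a stand-in for whatever call your client library uses to issue the batch, e.g. Thrift's batch_mutate):

```python
def chunked(rows, batch_size=50):
    """Yield successive batches of at most batch_size rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush any remaining rows
        yield batch

def write_stream(rows, send_batch, batch_size=50):
    """Send a long row stream as a series of small batches."""
    for batch in chunked(rows, batch_size):
        send_batch(batch)

# Example: 120 rows become batches of 50, 50, and 20.
sent = []
write_stream(range(120), sent.append, batch_size=50)
```

That keeps both the client-side message size and the per-replica mutation count bounded, whatever the length of the input stream.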
Hope that helps.
Freelance Cassandra Developer
On 9/12/2012, at 7:18 AM, Ben Hood <firstname.lastname@example.org> wrote:
Thanks for the clarification Andrey. If that is the case, I had better ensure that I don't put the entire contents of a very long input stream into a single batch, since that is presumably going to cause a very large message to accumulate on the client side (and if the message is being decoded on the server site as a complete message, then presumably the same resident memory consumption applies there too).
Cassandra uses Thrift messages to pass data to and from the server. A batch is just a convenient way to build such a message. Nothing happens until you send the message; presumably that is what you mean by "close the batch".