(and if the message is being decoded on the server side as a complete message, then presumably the same resident memory consumption applies there too).
Yerp. 
And every row mutation in your batch becomes a task in the Mutation thread pool. If one replica gets 500 row mutations from one client request, it will take a while for the (default) 32 threads to chew through them. While this is going on, other client requests will be effectively blocked.

Depending on the number of clients, I would start with, say, 50 rows per mutation and keep an eye on the *request* latency.
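
As a rough illustration of that starting point, here is a minimal Python sketch that cuts a long input stream into fixed-size chunks and times each request. It is not tied to any particular client library: `send_batch` is a hypothetical callable standing in for whatever your client uses to submit one batch, and `chunked`, `stream_in_batches` and `rows_per_batch` are names made up for this example.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` rows from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def stream_in_batches(
    rows: Iterable[T],
    send_batch: Callable[[List[T]], None],  # hypothetical: submits one batch
    rows_per_batch: int = 50,
) -> List[Tuple[int, float]]:
    """Send rows in small batches; return (row count, seconds) per request."""
    latencies = []
    for chunk in chunked(rows, rows_per_batch):
        started = time.perf_counter()
        send_batch(chunk)  # only this chunk is held in client memory
        latencies.append((len(chunk), time.perf_counter() - started))
    return latencies
```

Because the input is consumed lazily, only one chunk is buffered on the client at a time, and the returned timings give you the per-request latency to watch while you tune `rows_per_batch` up or down.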

Hope that helps. 


-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton

On 9/12/2012, at 7:18 AM, Ben Hood <0x6e6562@gmail.com> wrote:

Thanks for the clarification Andrey. If that is the case, I had better ensure that I don't put the entire contents of a very long input stream into a single batch, since that is presumably going to cause a very large message to accumulate on the client side (and if the message is being decoded on the server side as a complete message, then presumably the same resident memory consumption applies there too).

Cheers,


Ben

On Dec 7, 2012, at 17:24, Andrey Ilinykh <ailinykh@gmail.com> wrote:

Cassandra uses Thrift messages to pass data to and from the server. A batch is just a convenient way to create such a message. Nothing happens until you send this message. Probably, this is what you call "close the batch".
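
To make that concrete, here is a toy Python model of the idea (not a real client API): mutations pile up in a local list, and the server side only sees anything when the whole message is sent. `ClientSideBatch` and its `transport` callable are hypothetical names invented for this sketch.

```python
class ClientSideBatch:
    """Toy model: collects mutations locally until send() is called."""

    def __init__(self, transport):
        # `transport` is a hypothetical callable that delivers one
        # complete message to the server.
        self._transport = transport
        self._pending = []

    def add(self, row_key, column, value):
        # Nothing leaves the client here; the message just grows in memory.
        self._pending.append((row_key, column, value))

    def __len__(self):
        return len(self._pending)

    def send(self):
        # The whole batch goes out as a single message.
        message = list(self._pending)
        self._pending.clear()
        self._transport(message)
        return len(message)
```

In this model, a client that crashes before `send()` simply loses its local list, and the server never hears about those mutations.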

Thank you,
  Andrey


On Fri, Dec 7, 2012 at 5:34 AM, Ben Hood <0x6e6562@gmail.com> wrote:
Hi,

I'd like my app to stream a large number of events into Cassandra that originate from the same network input stream. If I create one batch mutation, can I just keep appending events to the Cassandra batch until I'm done, or are there some practical considerations about doing this (e.g. too much stuff buffering up on the client or server side, visibility of the data within the batch that hasn't been closed by the client yet)? Barring any discussion about atomicity, if I were able to stream a largish source into Cassandra, what would happen if the client crashed and didn't close the batch? Or is this kind of thing just a normal occurrence that Cassandra has to be aware of anyway?

Cheers,

Ben