From: Ben Hood <0x6e6562@gmail.com>
To: "user@cassandra.apache.org"
Subject: Re: Batch mutation streaming
Date: Sat, 8 Dec 2012 18:18:29 +0000

Thanks for the clarification, Andrey. If that is the case, I had better make sure that I don't put the entire contents of a very long input stream into a single batch, since that would presumably cause a very large message to accumulate on the client side (and if the server decodes the message as a whole, then presumably the same resident memory consumption applies there too).
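
As a rough sketch of that idea (not taken from the thread), the input stream could be cut into bounded chunks, with one batch sent per chunk, so that no single message grows with the length of the stream. The names `chunked`, `stream_in_batches`, `send_batch` and `max_batch_size` are hypothetical; `send_batch` stands in for whatever client call actually transmits a batch mutation.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(events: Iterable[T], max_batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists holding at most max_batch_size events each."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    iterator = iter(events)
    while True:
        # Pull only the next chunk from the stream, never the whole stream.
        chunk = list(islice(iterator, max_batch_size))
        if not chunk:
            return
        yield chunk


def stream_in_batches(
    events: Iterable[T],
    send_batch: Callable[[List[T]], None],
    max_batch_size: int = 100,
) -> int:
    """Send events in bounded batches and return the number of batches sent."""
    batches_sent = 0
    for chunk in chunked(events, max_batch_size):
        # Hypothetical stand-in for the real client call that sends one batch.
        send_batch(chunk)
        batches_sent += 1
    return batches_sent
```

With this shape, client-side memory is bounded by `max_batch_size` events rather than by the size of the stream. The trade-off is that the stream as a whole is no longer written as a single unit, so a crash partway through leaves the earlier chunks already written.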

Cheers,


Ben

On Dec 7, 2012, at 17:24, Andrey Ilinykh <ailinykh@gmail.com> wrote:

Cassandra uses Thrift messages to pass data to and from the server. A batch is just a convenient way to create such a message. Nothing happens until you send this message. This is probably what you call "close the batch".
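
A toy model of that behaviour, purely for illustration: the class below is not any real Cassandra client API, and `ClientSideBatch`, `transport` and the `(key, column, value)` tuple shape are all assumptions. It only shows that appending to a batch touches client memory alone, and that the server sees nothing until the message is sent.

```python
from typing import Callable, List, Tuple

# Illustrative mutation shape: (row key, column name, value).
Mutation = Tuple[str, str, str]


class ClientSideBatch:
    """Buffers mutations locally; nothing is transmitted until send()."""

    def __init__(self, transport: Callable[[List[Mutation]], None]) -> None:
        # Hypothetical callable that delivers one complete message to the server.
        self._transport = transport
        self._pending: List[Mutation] = []

    def insert(self, key: str, column: str, value: str) -> None:
        # Only client memory grows here; no network traffic happens.
        self._pending.append((key, column, value))

    def pending_count(self) -> int:
        return len(self._pending)

    def send(self) -> int:
        """Transmit all buffered mutations as one message; return how many."""
        mutations, self._pending = self._pending, []
        if mutations:
            self._transport(mutations)
        return len(mutations)
```

Under this model, a client that crashes before calling `send()` simply loses its local buffer, and the server never learns the batch existed.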

Thank you,
  Andrey


On Fri, Dec 7, 2012 at 5:34 AM, Ben Hood <0x6e6562@gmail.com> wrote:
Hi,

I'd like my app to stream a large number of events into Cassandra that originate from the same network input stream. If I create one batch mutation, can I just keep appending events to the Cassandra batch until I'm done, or are there some practical considerations about doing this (e.g. too much stuff buffering up on the client or server side, visibility of the data within the batch that hasn't been closed by the client yet)? Barring any discussion about atomicity, if I were able to stream a largish source into Cassandra, what would happen if the client crashed and didn't close the batch? Or is this kind of thing just a normal occurrence that Cassandra has to be aware of anyway?

Cheers,

Ben
