From: Robert Wille
Date: Wed, 11 Dec 2013 05:40:43 -0700
Subject: Re: What is the fastest way to get data into Cassandra 2 from a Java application?
Reply-To: user@cassandra.apache.org

I use hand-rolled batches a lot. You can get a *lot* of performance improvement. Just make sure to sanitize your strings.

I've been wondering, what's the limit, practical or hard, on the length of a query?

Robert

On 12/11/13, 3:37 AM, "David Tinker" wrote:

>Yes, that's what I found.
>
>This is faster:
>
>for (int i = 0; i < 1000; i++) session.execute("INSERT INTO test.wibble (id, info) VALUES ('${"" + i}', '${"aa" + i}')")
>
>Than this:
>
>def ps = session.prepare("INSERT INTO test.wibble (id, info) VALUES (?, ?)")
>for (int i = 0; i < 1000; i++) session.execute(ps.bind(["" + i, "aa" + i] as Object[]))
>
>This is the fastest option of all (hand-rolled batch):
>
>StringBuilder b = new StringBuilder()
>b.append("BEGIN UNLOGGED BATCH\n")
>for (int i = 0; i < 1000; i++) {
>    b.append("INSERT INTO ").append(ks).append(".wibble (id, info) VALUES ('").append(i).append("','")
>        .append("aa").append(i).append("')\n")
>}
>b.append("APPLY BATCH\n")
>session.execute(b.toString())
>
>
>On Wed, Dec 11, 2013 at 10:56 AM, Sylvain Lebresne wrote:
>>
>>> This loop takes 2500ms or so on my test cluster:
>>>
>>> PreparedStatement ps = session.prepare("INSERT INTO perf_test.wibble (id, info) VALUES (?, ?)")
>>> for (int i = 0; i < 1000; i++) session.execute(ps.bind("" + i, "aa" + i));
>>>
>>> The same loop with the parameters inline is about 1300ms. It gets
>>> worse if there are many parameters.
>>
>>
>> Do you mean that:
>>   for (int i = 0; i < 1000; i++)
>>       session.execute("INSERT INTO perf_test.wibble (id, info) VALUES (" + i + ", aa" + i + ")");
>> is twice as fast as using a prepared statement? And that the difference
>> is even greater if you add more columns than "id" and "info"?
>>
>> That would certainly be unexpected, are you sure you're not re-preparing the
>> statement every time in the loop?
>>
>> --
>> Sylvain
>>
>>> I know I can use batching to
>>> insert all the rows at once but that's not the purpose of this test. I
>>> also tried using session.execute(cql, params) and it is faster but
>>> still doesn't match inline values.
>>>
>>> Composing CQL strings is certainly convenient and simple but is there
>>> a much faster way?
>>>
>>> Thanks
>>> David
>>>
>>> I have also posted this on Stackoverflow if anyone wants the points:
>>>
>>> http://stackoverflow.com/questions/20491090/what-is-the-fastest-way-to-get-data-into-cassandra-2-from-a-java-application
>>
>>
>
>-- 
>http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration
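
For what "sanitize your strings" could look like with a hand-rolled batch, here is a minimal sketch, not anything taken from the thread's actual code. It assumes the (id, info) text columns from the quoted example and that doubling single quotes is all the escaping the values need. The names HandRolledBatch, quote and build are made up for illustration, and nothing here talks to a cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: builds one CQL string for an unlogged batch of inserts. */
final class HandRolledBatch {

    /** Wraps a value in single quotes, doubling any embedded single quote. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /**
     * Builds BEGIN UNLOGGED BATCH ... APPLY BATCH for (id, info) rows.
     * The table name is appended as-is, so it must come from trusted code.
     */
    static String build(String table, Map<String, String> rows) {
        StringBuilder b = new StringBuilder("BEGIN UNLOGGED BATCH\n");
        for (Map.Entry<String, String> row : rows.entrySet()) {
            b.append("INSERT INTO ").append(table).append(" (id, info) VALUES (")
             .append(quote(row.getKey())).append(", ")
             .append(quote(row.getValue())).append(")\n");
        }
        return b.append("APPLY BATCH\n").toString();
    }

    public static void main(String[] args) {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("1", "aa1");
        rows.put("2", "it's"); // embedded quote gets doubled by quote()
        // The resulting string would be handed to session.execute(...),
        // as in the quoted hand-rolled batch above.
        System.out.print(build("test.wibble", rows));
    }
}
```

Without the quote() step, a value containing a single quote would end the string literal early and either break the batch or change what it does.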