Date: Tue, 9 Jun 2015 02:19:34 -0700
From: SHANKAR REDDY
To: user@cassandra.apache.org
Subject: Re: Cassandra Insert Rate
In-Reply-To: <5576AA26.3070007@ericsson.com>

Thanks for the quick response on this; I appreciate it.

That helps for batch loads. But if 10 million users each insert one new record at the same time, how do we increase the insert rate? My sample program assumes requests for 10 million records.

-Shankar

On Tue, Jun 9, 2015 at 1:56 AM, Marcus Olsson <marcus.olsson@ericsson.com> wrote:

> Hi Shankar,
>
> I would say:
> * Use prepared statements to avoid sending the whole statement with every
>   query and instead just send the values.
> * Use session.executeAsync() to improve concurrency.
>
> So you would start by creating a prepared statement, something like:
>
> PreparedStatement ps = session.prepare("INSERT INTO ks.tb (key1,data1,data2) VALUES (?,?,?)"); // Only done once
>
> And then in loadData():
>
> session.executeAsync(ps.bind("key", "1", "2"));
>
> executeAsync() does not wait for the query's response, so handling the
> response should probably be done elsewhere (see the link below for how
> you can get the results back).
>
> http://www.datastax.com/dev/blog/java-driver-async-queries
>
> BR
> Marcus Olsson
>
> On 06/09/2015 10:27 AM, SHANKAR REDDY wrote:
>
> > Team,
> > I have a sample insert program that loads around 10 million records, and
> > I found that the insert rate is around 1500 per second. This is very slow.
> >
> > The source code I am using is available at the location below. I am using
> > the latest version, 2.1.6, with default settings on a single-node VM with
> > 20 GB RAM and a 100 GB SSD.
> >
> > https://github.com/shankar-reddy/CassandraSandbox/blob/master/src/main/java/com/itreddys/cassandra/example/BulkLoadTest.java
> >
> > Please suggest how to improve the insert rate.
> >
> > -Shankar
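
One way to read the advice about executeAsync() not waiting for a response: the caller still has to collect completions somewhere, and it usually helps to cap how many requests are outstanding at once. The following is only a minimal sketch of that pattern using the Java standard library. The `insertAsync` function is a hypothetical stand-in for `session.executeAsync(ps.bind(...))`, `CompletableFuture` is used instead of the driver's own future type, and the class name, method name, and the limit of 128 in-flight requests are assumptions made for illustration, not values from this thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntFunction;

public class BoundedAsyncInserts {

    /**
     * Issues `total` asynchronous inserts while keeping at most `maxInFlight`
     * of them outstanding, and returns how many completed without an error.
     * `insertAsync` is a hypothetical stand-in for the real driver call.
     */
    public static int runInserts(int total, int maxInFlight,
                                 IntFunction<CompletableFuture<Void>> insertAsync)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger succeeded = new AtomicInteger();

        for (int i = 0; i < total; i++) {
            // Blocks the producer once maxInFlight requests are outstanding.
            permits.acquire();
            insertAsync.apply(i).whenComplete((ignored, error) -> {
                if (error == null) {
                    succeeded.incrementAndGet();
                }
                // Free the slot whether the insert succeeded or failed.
                permits.release();
            });
        }

        // Taking every permit back means all callbacks have run.
        permits.acquire(maxInFlight);
        permits.release(maxInFlight);
        return succeeded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Fake insert: does no I/O, only completes on a pool thread.
            IntFunction<CompletableFuture<Void>> fakeInsert =
                i -> CompletableFuture.runAsync(() -> { }, pool);
            int ok = runInserts(10_000, 128, fakeInsert);
            System.out.println("completed inserts: " + ok);
        } finally {
            pool.shutdown();
        }
    }
}
```

The semaphore is what keeps a fast producer from queueing millions of requests in memory; the right limit depends on the cluster and would need to be measured rather than taken from this sketch.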