Subject: Re: Performance problem with large wide row inserts using CQL
From: DuyHai Doan <doanduyhai@gmail.com>
To: user@cassandra.apache.org
Date: Wed, 19 Feb 2014 16:47:28 +0100

Agree with John

Preparing a statement follows this process:

1) send the statement to the server
2) statement validation on server side
3) if validation is ok, the C* node will assign a UUID to this prepared statement
4) send back the UUID to the Java driver core

Now, you can re-use this same prepared statement millions of times with BoundStatement bs = preparedStatement.bind(values...)

Please note that there will be a maximum of 100,000 prepared statements retained per node.
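
For example, a minimal sketch with the DataStax Java driver (the contact point, the 10000-column loop and the class name are illustrative, not taken from the thread's actual code):

    import com.datastax.driver.core.*;

    public class PreparedInsertSketch {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")  // illustrative contact point
                    .build();
            Session session = cluster.connect();

            // Prepared once: the server validates the statement and returns
            // an id that the driver caches.
            PreparedStatement ps = session.prepare(
                    "INSERT INTO test.wide (time, name, value) VALUES (?, ?, ?)");

            // Bound many times: only the statement id and the values travel
            // over the wire.
            for (int i = 0; i < 10000; i++) {
                BoundStatement bs = ps.bind("t0", "name" + i, "val" + i);
                session.execute(bs);
            }
            cluster.close();
        }
    }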


On Wed, Feb 19, 2014 at 3:57 PM, John Sanda <john.sanda@gmail.com> wrote:
From a quick glance at your code, it looks like you are preparing your insert statement multiple times. You only need to prepare it once. I would expect to see some improvement with that change.


On Wed, Feb 19, 2014 at 5:27 AM, Rüdiger Klaehn <rklaehn@gmail.com> wrote:
Hi all,

I am evaluating Cassandra for satellite telemetry storage and analysis. I set up a little three-node cluster on my local development machine and wrote a few simple test programs.

My use case requires storing incoming telemetry updates in the database at the same rate as they come in. A telemetry update is a map of name/value pairs that arrives at a certain time.

The idea is to store the data as quickly as possible, and then later store it in an additional format that is more amenable to analysis.

The format I have chosen for my test is the following:
CREATE TABLE IF NOT EXISTS test.wide (
    time varchar,
    name varchar,
    value varchar,
    PRIMARY KEY (time, name)
) WITH COMPACT STORAGE;
The layout I want to achieve with this is something like this:
+-------+-------+-------+-------+-------+-------+
|       | name1 | name2 | name3 | ...   | nameN |
| time  +-------+-------+-------+-------+-------+
|       | val1  | val2  | val3  | ...   | valN  |
+-------+-------+-------+-------+-------+-------+
(Time will at some point be some kind of timestamp, and value will become a blob. But this is just for initial testing.)

The problem is the following: I am getting very low performance for bulk inserts into the above table. In my test program, each insert has a new, unique time and creates a row with 10000 name/value pairs. This should map to creating a new row in the underlying storage engine, correct? I do that 1000 times and measure both time per insert and total time.

I am getting about 0.5s for each insert of 10000 name/value pairs, which is much slower than the rate at which the telemetry arrives at my system. I have read a few previous threads on this subject and am using batch prepared statements for maximum performance ( https://issues.apache.org/jira/browse/CASSANDRA-4693 ). But that does not help.

Here is the CQL benchmark: https://gist.github.com/rklaehn/9089304#file-cassandratestminimized-scala
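
In outline, the batched insert loop looks something like this (a sketch, not the exact benchmark code: ps and session come from a setup like the sketch above, time is the row key for the current insert, and the chunk size of 500 is an arbitrary illustration):

    // One unlogged batch per chunk of the 10000-column row.
    BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
    for (int i = 0; i < 10000; i++) {
        batch.add(ps.bind(time, "name" + i, "val" + i));
        if (batch.size() == 500) {
            session.execute(batch);
            batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        }
    }
    if (batch.size() > 0) {
        session.execute(batch);
    }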

I have written the exact same thing using the thrift API of astyanax, and I am getting much better performance. Each insert of 10000 name/values takes 0.04s using a ColumnListMutation. When I use async calls for both programs, as suggested by somebody on Stack Overflow, the difference gets even larger. The CQL insert remains at 0.5s per insert on average, whereas the astyanax ColumnListMutation approach takes 0.01s per insert on average, even on my test cluster. That's the kind of performance I need.
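
For comparison, the astyanax write path is roughly the following (again a sketch, with error handling omitted; the keyspace object and the row key time are assumed to be set up elsewhere, and the column family is assumed to match the table above):

    import com.netflix.astyanax.ColumnListMutation;
    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;

    ColumnFamily<String, String> CF_WIDE = new ColumnFamily<String, String>(
            "wide", StringSerializer.get(), StringSerializer.get());

    // A single mutation batch writes the whole wide row in one thrift call.
    MutationBatch mb = keyspace.prepareMutationBatch();
    ColumnListMutation<String> row = mb.withRow(CF_WIDE, time);
    for (int i = 0; i < 10000; i++) {
        row.putColumn("name" + i, "val" + i, null);  // null = no TTL
    }
    mb.execute();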

Here is the thrift benchmark, modified from an astyanax example: https://gist.github.com/rklaehn/9089304#file-astclient-java

I realize that running on a test cluster on localhost is not a 100% realistic test. But nevertheless you would expect both tests to have roughly similar performance.

I saw a few suggestions to create a table with CQL and fill it using the thrift API, for example in this thread: http://mail-archives.apache.org/mod_mbox/cassandra-user/201309.mbox/%3C523334B8.8070802@gmail.com%3E . But I would very much prefer to use pure CQL for this. It seems that the thrift API is considered deprecated, so I would not feel comfortable starting a new project using a legacy API.

I already posted a question on SO about this, but did not get any satisfactory answer, just general performance tuning tips that do nothing to explain the difference between the CQL and thrift approaches.
http://stackoverflow.com/questions/21778671/cassandra-how-to-insert-a-new-wide-row-with-good-performance-using-cql

Am I doing something wrong, or is this a fundamental limitation of CQL? If the latter is the case, what's the plan to mitigate the issue?

There is a JIRA issue about this ( https://issues.apache.org/jira/browse/CASSANDRA-5959 ), but it is marked as a duplicate of https://issues.apache.org/jira/browse/CASSANDRA-4693 . But according to my benchmarks, batch prepared statements do not solve this issue!

I would really appreciate any help on this issue. The telemetry data I would like to import into C* for testing contains ~2*10^12 samples, where each sample consists of time, value and status. If quick batch insertion is not possible, I would not even be able to insert it in an acceptable time.
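
(For scale, using the numbers above: at 0.5s per 10000 samples, ~2*10^12 samples would take about 10^8 seconds, i.e. more than three years of continuous writing; even at the astyanax rate of 0.01s per 10000 it is still about 2*10^6 seconds, roughly three weeks.)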

best regards,

Rüdiger



--

- John
