Subject: Re: Performance problem with large wide row inserts using CQL
From: DuyHai Doan <doanduyhai@gmail.com>
To: user@cassandra.apache.org
Date: Wed, 19 Feb 2014 16:47:28 +0100

Agree with John

Preparing a statement follows this process:

1) send the statement to the server
2) statement validation on server side
3) if validation is ok, the C* node will assign a UUID to this prepared statement
4) send back the UUID to the Java driver core

Now, you can re-use this same prepared statement millions of times with BoundStatement bs = preparedStatement.bind(values...)

Please note that there will be a maximum of 100,000 prepared statements retained per node.
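
For example, a minimal sketch with the DataStax Java driver (the contact point, the 10000-column loop and the class name are illustrative, not taken from the thread's actual code):

    import com.datastax.driver.core.*;

    public class PreparedInsertSketch {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")  // illustrative contact point
                    .build();
            Session session = cluster.connect();

            // Prepared once: the server validates the statement and returns
            // an id that the driver caches.
            PreparedStatement ps = session.prepare(
                    "INSERT INTO test.wide (time, name, value) VALUES (?, ?, ?)");

            // Bound many times: only the statement id and the values travel
            // over the wire.
            for (int i = 0; i < 10000; i++) {
                BoundStatement bs = ps.bind("t0", "name" + i, "val" + i);
                session.execute(bs);
            }
            cluster.close();
        }
    }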


On Wed, Feb 19, 2014 at 3:57 PM, John Sanda <john.sanda@gmail.com> wrote:
From a quick glance at your code, it looks like you are preparing your insert statement multiple times. You only need to prepare it once. I would expect to see some improvement with that change.


On Wed, Feb 19, 2014 at 5:27 AM, Rüdiger Klaehn <rklaehn@gmail.com> wrote:
Hi all,

I am evaluating Cassandra for satellite telemetry storage and analysis. I set up a little three-node cluster on my local development machine and wrote a few simple test programs.

My use case requires storing incoming telemetry updates in the database at the same rate as they come in. A telemetry update is a map of name/value pairs that arrives at a certain time.

The idea is to store the data as quickly as possible, and then later store it in an additional format that is more amenable to analysis.

The format I have chosen for my test is the following:
CREATE TABLE IF NOT EXISTS test.wide (
    time varchar,
    name varchar,
    value varchar,
    PRIMARY KEY (time, name)
) WITH COMPACT STORAGE;
The layout I want to achieve with this is something like this:
+-------+-------+-------+-------+-------+-------+
|       | name1 | name2 | name3 | ...   | nameN |
| time  +-------+-------+-------+-------+-------+
|       | val1  | val2  | val3  | ...   | valN  |
+-------+-------+-------+-------+-------+-------+
(Time will at some point be some kind of timestamp, and value will become a blob. But this is just for initial testing.)

The problem is the following: I am getting very low performance for bulk inserts into the above table. In my test program, each insert has a new, unique time and creates a row with 10000 name/value pairs. This should map to creating a new row in the underlying storage engine, correct? I do that 1000 times and measure both time per insert and total time.

I am getting about 0.5s for each insert of 10000 name/value pairs, which is much slower than the rate at which the telemetry arrives at my system. I have read a few previous threads on this subject and am using batch prepared statements for maximum performance ( https://issues.apache.org/jira/browse/CASSANDRA-4693 ). But that does not help.

Here is the CQL benchmark: https://gist.github.com/rklaehn/9089304#file-cassandratestminimized-scala
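
In outline, the batched insert loop looks something like this (a sketch, not the exact benchmark code: ps and session come from a setup like the sketch above, time is the row key for the current insert, and the chunk size of 500 is an arbitrary illustration):

    // One unlogged batch per chunk of the 10000-column row.
    BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
    for (int i = 0; i < 10000; i++) {
        batch.add(ps.bind(time, "name" + i, "val" + i));
        if (batch.size() == 500) {
            session.execute(batch);
            batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        }
    }
    if (batch.size() > 0) {
        session.execute(batch);
    }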

I have written the exact same thing using the thrift API of astyanax, and I am getting much better performance. Each insert of 10000 name/values takes 0.04s using a ColumnListMutation. When I use async calls for both programs, as suggested by somebody on Stack Overflow, the difference gets even larger. The CQL insert remains at 0.5s per insert on average, whereas the astyanax ColumnListMutation approach takes 0.01s per insert on average, even on my test cluster. That's the kind of performance I need.
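
For comparison, the astyanax write path is roughly the following (again a sketch, with error handling omitted; the keyspace object and the row key time are assumed to be set up elsewhere, and the column family is assumed to match the table above):

    import com.netflix.astyanax.ColumnListMutation;
    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;

    ColumnFamily<String, String> CF_WIDE = new ColumnFamily<String, String>(
            "wide", StringSerializer.get(), StringSerializer.get());

    // A single mutation batch writes the whole wide row in one thrift call.
    MutationBatch mb = keyspace.prepareMutationBatch();
    ColumnListMutation<String> row = mb.withRow(CF_WIDE, time);
    for (int i = 0; i < 10000; i++) {
        row.putColumn("name" + i, "val" + i, null);  // null = no TTL
    }
    mb.execute();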

Here is the thrift benchmark, modified from an astyanax example: https://gist.github.com/rklaehn/9089304#file-astclient-java

I realize that running on a test cluster on localhost is not a 100% realistic test. But nevertheless you would expect both tests to have roughly similar performance.

I saw a few suggestions to create a table with CQL and fill it using the thrift API, for example in this thread: http://mail-archives.apache.org/mod_mbox/cassandra-user/201309.mbox/%3C523334B8.8070802@gmail.com%3E . But I would very much prefer to use pure CQL for this. It seems that the thrift API is considered deprecated, so I would not feel comfortable starting a new project using a legacy API.

I already posted a question on SO about this, but did not get any satisfactory answer, just general performance tuning tips that do nothing to explain the difference between the CQL and thrift approaches.
http://stackoverflow.com/questions/21778671/cassandra-how-to-insert-a-new-wide-row-with-good-performance-using-cql

Am I doing something wrong, or is this a fundamental limitation of CQL? If the latter is the case, what's the plan to mitigate the issue?

There is a JIRA issue about this ( https://issues.apache.org/jira/browse/CASSANDRA-5959 ), but it is marked as a duplicate of https://issues.apache.org/jira/browse/CASSANDRA-4693 . But according to my benchmarks, batch prepared statements do not solve this issue!

I would really appreciate any help on this issue. The telemetry data I would like to import into C* for testing contains ~2*10^12 samples, where each sample consists of time, value and status. If quick batch insertion is not possible, I would not even be able to insert it in an acceptable time.
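
(For scale, using the numbers above: at 0.5s per 10000 samples, ~2*10^12 samples would take about 10^8 seconds, i.e. more than three years of continuous writing; even at the astyanax rate of 0.01s per 10000 it is still about 2*10^6 seconds, roughly three weeks.)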

best regards,

Rüdiger



--

- John
