Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: message received from 54.191.145.13
 which is an MX secondary for user@cassandra.apache.org)
MIME-Version: 1.0
Sender: erickramirezonline@gmail.com
In-Reply-To: <1430247364917-7600561.post@n2.nabble.com>
References: <1430247364917-7600561.post@n2.nabble.com>
Date: Mon, 4 May 2015 09:05:54 +1000
Message-ID: 
 <CAB=jy0jm=Fqu7EkDr=+CEbuzC3ZKDM9cOX46ibD0155=Wh2RXg@mail.gmail.com>
Subject: Re: Denormalization leads to terrible, rather than better, Cassandra
 performance -- I am really puzzled
From: Erick Ramirez <erick@ramirez.com.au>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=047d7bdc157ac29fea0515357ce5

--047d7bdc157ac29fea0515357ce5
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hello, there.

In relation to the Java driver, I would recommend updating to the latest
version, as there were a lot of issues reported in versions earlier than
2.0.9 where the driver incorrectly marked nodes as down/not available.

In fact, there is a new version of the driver being released in the next
24-48 hours that reverts JAVA-425 to resolve this issue.
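
A minimal sketch of how one might flag a driver version earlier than 2.0.9,
for example in a startup check. The class and method names are hypothetical,
and it assumes purely numeric dotted versions (a suffix such as "-rc1" is
not handled).

```java
import java.util.Arrays;

public class DriverVersionCheck {
    // Parses a dotted numeric version such as "2.0.8" into its integer parts.
    static int[] parse(String version) {
        return Arrays.stream(version.split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Returns true when 'version' is strictly earlier than 'minimum'.
    // Missing trailing components are treated as 0, so "2.0" equals "2.0.0".
    static boolean isEarlierThan(String version, String minimum) {
        int[] a = parse(version);
        int[] b = parse(minimum);
        for (int i = 0; i < Math.max(a.length, b.length); i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return x < y;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 2.0.8 is the driver version reported in the original post.
        String inUse = "2.0.8";
        if (isEarlierThan(inUse, "2.0.9")) {
            System.out.println("Driver " + inUse + " is earlier than 2.0.9; consider upgrading.");
        }
    }
}
```

The version string is passed in by hand here; how it is obtained in a real
build is left open.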

Cheers,
Erick

*Erick Ramirez*
About Me about.me/erickramirezonline

Make a difference today!
* Reduce your carbon footprint <http://on.mash.to/1vZL7fX>
* Give back to the community <http://www.govolunteer.com.au>
* Write free software <http://www.opensource.org>


On Wed, Apr 29, 2015 at 4:56 AM, dlu66061 <dlu66061@yahoo.com> wrote:

> Cassandra gurus, I am really puzzled by my observations, and hope to get
> some help explaining the results. Thanks in advance.
>
> I think it has always been advocated in the Cassandra community that
> de-normalization leads to better performance. I wanted to see how much
> performance improvement it can offer, but the results were the complete
> opposite. The performance degraded dramatically for simultaneous requests
> for the same set of data.
>
> *Environment:*
>
> I have a Cassandra cluster consisting of 3 AWS m3.large instances, with
> Cassandra 2.0.6 installed and pretty much default settings. My program is
> written in Java using Java Driver 2.0.8.
>
> *Normalized case:*
>
> I have two tables created with the following 2 CQL statements
>
> CREATE TABLE event (event_id UUID, time_token timeuuid, … 30 other
> attributes, … PRIMARY KEY (event_id))
>
> CREATE TABLE event_index (index_key text, time_token timeuuid, event_id
> UUID, PRIMARY KEY (index_key, time_token))
>
> In my program, given the proper index_key and a token range
> (tokenLowerBound to tokenUpperBound), I first query the event_index table
>
> *Query 1:*
>
> SELECT * FROM event_index WHERE index_key IN (…) AND time_token >
> tokenLowerBound AND time_token <= tokenUpperBound ORDER BY time_token ASC
> LIMIT 2000
>
> to get a list of event_ids and then run the following CQL to get the event
> details.
>
> *Query 2:*
>
> SELECT * FROM event WHERE event_id IN (a list of event_ids from the above
> query)
>
> I repeat the above process, with an updated token range from the previous
> run. This actually performs pretty well.
>
> In this normalized process, I have to *run 2 queries* to get data: the
> first one should be very quick since it is getting a slice of an internally
> wide row. The second query may take longer because it needs to hit up to
> 2000 rows of the event table.
>
> *De-normalized case:*
>
> What if we can attach event detail to the index and run just 1 query? Like
> Query 1, would it be much faster since it is also getting a slice of an
> internally wide row?
>
> I created a third table that merged the above two tables together. Notice
> the first three attributes and the PRIMARY KEY definition are exactly the
> same as the "event_index" table.
>
> CREATE TABLE event_index_with_detail (index_key text, time_token timeuuid,
> event_id UUID, … 30 other attributes, … PRIMARY KEY (index_key,
> time_token))
>
> Then I can just run the following query to achieve my goal, with the same
> index and token range as in query 1:
>
> *Query 3:*
>
> SELECT * FROM event_index_with_detail WHERE index_key IN (…) AND
> time_token > tokenLowerBound AND time_token <= tokenUpperBound ORDER BY
> time_token ASC LIMIT 2000
>
> *Performance observations*
>
> Using Java Driver 2.0.8, I wrote a program that runs Query 1 + Query 2 in
> the normalized case, or Query 3 in the denormalized case. All queries are
> run at the LOCAL_QUORUM consistency level.
>
> Then I created 1 or more instances of the program to simultaneously
> retrieve the SAME set of 1 million events stored in Cassandra. Each test
> runs for 5 minutes, and the results are shown below.
>
>
>
>                 1 instance   5 instances   10 instances
> Normalized      89           315           417
> Denormalized    100          *43*          *3*
>
>
> Note that the unit of measure is number of operations. So in the
> normalized case, the program runs 89 times and retrieves 178K events for a
> single instance, 315 times and 630K events across 5 instances (each instance
> gets about 126K events), and 417 times and 834K events across 10 instances
> simultaneously (each instance gets about 83.4K events).
>
> For the de-normalized case, the performance is a little better in the
> single-instance case, in which the program runs 100 times and retrieves
> 200K events. However, it goes sharply south with multiple simultaneous
> instances. All 5 instances together completed only 43 operations
> successfully, and all 10 instances together completed only 3. In the
> latter case, the log showed that 3 instances each retrieved 2000 events
> successfully, and the 7 other instances retrieved 0.
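
The event counts quoted above follow from the operation counts if each
operation returns the full LIMIT of 2000 events. A small sketch of that
arithmetic, assuming every completed operation hits the limit (class and
method names are hypothetical):

```java
public class EventsRetrieved {
    // Each operation is capped by LIMIT 2000 in the quoted queries;
    // this sketch assumes every completed operation returns the full 2000.
    static final int EVENTS_PER_OPERATION = 2000;

    // Total events retrieved by all instances together.
    static long totalEvents(int operations) {
        return (long) operations * EVENTS_PER_OPERATION;
    }

    // Average events retrieved per program instance.
    static double eventsPerInstance(int operations, int instances) {
        return (double) totalEvents(operations) / instances;
    }

    public static void main(String[] args) {
        int[] instances = {1, 5, 10};
        int[] normalized = {89, 315, 417};   // operations from the quoted table
        int[] denormalized = {100, 43, 3};   // operations from the quoted table
        for (int i = 0; i < instances.length; i++) {
            System.out.printf(
                "%d instance(s): normalized %d events (%.0f each), denormalized %d events (%.0f each)%n",
                instances[i],
                totalEvents(normalized[i]), eventsPerInstance(normalized[i], instances[i]),
                totalEvents(denormalized[i]), eventsPerInstance(denormalized[i], instances[i]));
        }
    }
}
```

Under that assumption, the denormalized runs with 5 and 10 instances
retrieved 86K and 6K events in total, against 630K and 834K for the
normalized runs.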
>
> In the de-normalized case, the program reported a lot of exceptions like
> below:
>
> com.datastax.driver.core.exceptions.ReadTimeoutException, Cassandra
> timeout during read query at consistency LOCAL_QUORUM (2 responses were
> required but only 1 replica responded)
>
> com.datastax.driver.core.exceptions.NoHostAvailableException, All host(s)
> tried for query failed (no host was tried)
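
On the "2 responses were required" part of the timeout message: a quorum is
a majority of the replicas, floor(RF / 2) + 1, and LOCAL_QUORUM applies that
to the replicas in the local data center. The replication factor is not
stated in the post, so the sketch below simply shows which values would
produce a quorum of 2 (the class name is hypothetical):

```java
public class QuorumSize {
    // Quorum is a majority of replicas: floor(RF / 2) + 1.
    // Integer division already rounds down for positive values.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        // The replication factor is an assumption here; the post does not state it.
        for (int rf = 1; rf <= 5; rf++) {
            System.out.println("RF=" + rf + " -> quorum of " + quorum(rf));
        }
    }
}
```

A quorum of 2 is consistent with a replication factor of either 2 or 3. With
RF=3 one slow replica can be tolerated, whereas with RF=2 every replica must
answer in time.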
>
> I repeated the two cases back and forth several times, and the results
> remained the same.
>
> I also observed CPU usage on the 3 Cassandra servers, and they were all
> much higher for the de-normalized case.
>
>
>
>                 1 instance          5 instances        10 instances
> Normalized      7% usr, 2% sys      30% usr, 8% sys    40% usr, 10% sys
> Denormalized    44% usr, 0.3% sys   65% usr, 1% sys    70% usr, 2% sys
>
>
> *Questions*
>
> This is really not what I expected, and I am puzzled and have not figured
> out a good explanation.
>
>    - Why are there so many exceptions in the de-normalized case? I would
>    think Cassandra should be able to handle simultaneous accesses to the
>    same data. Why are there NO exceptions in the normalized case? After
>    all, the environments for the two cases are basically the same.
>    - Is an (internally) wide row only good for a small amount of data under
>    each column name?
>    - Or is it an issue with the Java Driver?
>    - Or did I do something wrong?
>

--047d7bdc157ac29fea0515357ce5--