Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: message received from 54.76.25.247 which
 is an MX secondary for cassandra-user@incubator.apache.org)
Date: Tue, 28 Apr 2015 11:56:04 -0700 (MST)
From: dlu66061 <dlu66061@yahoo.com>
To: cassandra-user@incubator.apache.org
Message-ID: <1430247364917-7600561.post@n2.nabble.com>
Subject: Denormalization leads to terrible, rather than better, Cassandra
 performance -- I am really puzzled
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_Part_21528_199927625.1430247364919"

------=_Part_21528_199927625.1430247364919
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Cassandra gurus, I am really puzzled by my observations, and hope to get some
help explaining the results. Thanks in advance.
I think it has always been advocated in the Cassandra community that
de-normalization leads to better performance. I wanted to see how much
performance improvement it can offer, but the results were the total opposite.
The performance degraded dramatically for simultaneous requests for the
same set of data.
*Environment:*
I have a Cassandra cluster consisting of 3 AWS m3.large instances, with
Cassandra 2.0.6 installed and pretty much default settings. My program is
written in Java using Java Driver 2.0.8.
*Normalized case:*
I have two tables, created with the following two CQL statements:
    CREATE TABLE event (event_id UUID, time_token timeuuid, ... 30 other
    attributes, ... PRIMARY KEY (event_id))
    CREATE TABLE event_index (index_key text, time_token timeuuid, event_id
    UUID, PRIMARY KEY (index_key, time_token))
In my program, given the proper index_key and a token range (tokenLowerBound
to tokenUpperBound), I first query the event_index table:
/Query 1:/
    SELECT * FROM event_index WHERE index_key IN (...) AND time_token >
    tokenLowerBound AND time_token <= tokenUpperBound ORDER BY time_token ASC
    LIMIT 2000
to get a list of event_ids, and then run the following CQL to get the event
details:
/Query 2:/
    SELECT * FROM event WHERE event_id IN (a list of event_ids from the above
    query)
I repeat the above process, with the token range updated from the previous run.
This actually performs pretty well.
In this normalized process, I have to *run 2 queries* to get data: the first
one should be very quick since it is getting a slice of an internally wide
row. The second query may take longer because it needs to hit up to 2000 rows
of the event table.
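
The two-step read loop described above can be sketched in Python against in-memory stand-ins for the two tables. This is only an illustration of the paging logic, not the author's Java program: the function names, the use of plain integers in place of timeuuid tokens, and the single-index_key simplification are all assumptions made for the sketch.

```python
from bisect import bisect_right

PAGE_SIZE = 2000  # mirrors LIMIT 2000 in Query 1


def query_index(index_rows, lower, upper, limit=PAGE_SIZE):
    """Stand-in for Query 1 on one index_key.

    index_rows is a list of (time_token, event_id) sorted by time_token.
    Returns rows with lower < time_token <= upper, ascending, at most `limit`.
    """
    tokens = [token for token, _ in index_rows]
    start = bisect_right(tokens, lower)   # first row strictly above lower
    end = bisect_right(tokens, upper)     # one past the last row <= upper
    return index_rows[start:min(end, start + limit)]


def query_events(event_table, event_ids):
    """Stand-in for Query 2: look up event details by event_id."""
    return [event_table[event_id] for event_id in event_ids]


def read_all(index_rows, event_table, lower, upper, limit=PAGE_SIZE):
    """Repeat Query 1 + Query 2, moving the lower bound to the last token seen."""
    events = []
    while True:
        page = query_index(index_rows, lower, upper, limit)
        if not page:
            break
        events.extend(query_events(event_table, [eid for _, eid in page]))
        lower = page[-1][0]  # next run starts after the last token returned
    return events
```

Each pass through the loop corresponds to one "operation" in the measurements: one index slice followed by one multi-key lookup.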
*De-normalized case:*
What if we could attach the event details to the index and run just 1 query?
Like Query 1, would it be much faster, since it is also getting a slice of an
internally wide row?
I created a third table that merges the above two tables. Notice that the
first three attributes and the PRIMARY KEY definition are exactly the same as
in the "event_index" table.
    CREATE TABLE event_index_with_detail (index_key text, time_token timeuuid,
    event_id UUID, ... 30 other attributes, ... PRIMARY KEY
    (index_key, time_token))
Then I can just run the following query to achieve my goal, with the same
index and token range as in Query 1:
/Query 3:/
    SELECT * FROM event_index_with_detail WHERE index_key IN (...) AND
    time_token > tokenLowerBound AND time_token <= tokenUpperBound ORDER BY
    time_token ASC LIMIT 2000
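
As a rough sanity check on what each approach returns, the number of column values in one full 2000-row page can be counted from the table definitions. The sketch below is back-of-the-envelope arithmetic only. It assumes every page is full, takes the column counts from the CREATE TABLE statements (3 for event_index, 32 for event, 33 for event_index_with_detail), and ignores value sizes, the number of index_key values in the IN clause, and all server-side work, so it does not by itself explain any timing difference.

```python
ROWS_PER_PAGE = 2000      # LIMIT 2000 in Query 1 and Query 3
INDEX_COLUMNS = 3         # index_key, time_token, event_id
OTHER_ATTRIBUTES = 30     # the "30 other attributes" in the post
EVENT_COLUMNS = 2 + OTHER_ATTRIBUTES            # event_id, time_token + 30
DENORMALIZED_COLUMNS = INDEX_COLUMNS + OTHER_ATTRIBUTES


def values_per_page(columns, rows=ROWS_PER_PAGE):
    """Column values returned by one full page of `rows` rows."""
    return columns * rows


# Normalized: Query 1 (index slice) plus Query 2 (event details).
normalized_values = values_per_page(INDEX_COLUMNS) + values_per_page(EVENT_COLUMNS)

# Denormalized: Query 3 alone.
denormalized_values = values_per_page(DENORMALIZED_COLUMNS)

print("normalized:", normalized_values)
print("denormalized:", denormalized_values)
```

Under these assumptions both approaches return a similar number of values per operation, which is consistent with the expectation stated above that the single query should not be slower.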
*Performance observations*
Using Java Driver 2.0.8, I wrote a program that runs Query 1 + Query 2 in the
normalized case, or Query 3 in the denormalized case. All queries are set with
LOCAL_QUORUM consistency level.
Then I created 1 or more instances of the program to simultaneously retrieve
the SAME set of 1 million events stored in Cassandra. Each test runs for 5
minutes, and the results are shown below.

               1 instance   5 instances   10 instances
Normalized     89           315           417
Denormalized   100          *43*          *3*

Note that the unit of measure is the number of operations. So in the normalized
case, the program runs 89 times and retrieves 178K events for a single
instance, 315 times and 630K events for 5 instances (each instance gets about
126K events), and 417 times and 834K events for 10 instances simultaneously
(each instance gets about 83.4K events).
For the de-normalized case, the performance is a little better for a
single instance, in which the program runs 100 times and retrieves 200K
events. However, it turns sharply south for multiple simultaneous instances.
All 5 instances together completed only 43 operations successfully, and all
10 instances together completed only 3 operations successfully. For the
latter case, the log showed that 3 instances each retrieved 2000 events
successfully, and the 7 other instances retrieved 0.
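
The event counts quoted above follow from the operation counts if every successful operation returns a full page of 2000 events. A small Python sketch of that conversion is shown here. The full-page assumption and the function names are introduced only for illustration, and the denormalized totals for 5 and 10 instances are derived under that assumption rather than reported in the measurements.

```python
EVENTS_PER_OPERATION = 2000  # assumes every operation returns a full page


def events_retrieved(operations):
    """Total events for a given number of successful operations."""
    return operations * EVENTS_PER_OPERATION


def events_per_instance(operations, instances):
    """Average events per program instance."""
    return events_retrieved(operations) / instances


# Operation counts from the results table (5-minute runs).
results = {
    "normalized": {1: 89, 5: 315, 10: 417},
    "denormalized": {1: 100, 5: 43, 10: 3},
}

for case, by_instances in results.items():
    for instances, operations in by_instances.items():
        print(case, instances, events_retrieved(operations),
              events_per_instance(operations, instances))
```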
In the de-normalized case, the program reported a lot of exceptions like
the ones below:
    com.datastax.driver.core.exceptions.ReadTimeoutException, Cassandra timeout
    during read query at consistency LOCAL_QUORUM (2 responses were required but
    only 1 replica responded)
    com.datastax.driver.core.exceptions.NoHostAvailableException, All host(s)
    tried for query failed (no host was tried)
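
The "2 responses were required" part of the timeout message comes from how a quorum is computed: a majority of the replicas, that is floor(RF / 2) + 1 for replication factor RF (counted in the local data center for LOCAL_QUORUM). The post does not state the keyspace's replication factor, so the values tried below are assumptions. A requirement of 2 responses is consistent with a replication factor of either 2 or 3.

```python
def quorum(replication_factor):
    """Replica responses needed for QUORUM / LOCAL_QUORUM: a strict majority."""
    return replication_factor // 2 + 1


# The replication factor is not given in the post; these are candidate values.
for rf in (1, 2, 3):
    print("RF", rf, "-> quorum", quorum(rf))
```

With a quorum of 2, a read fails with this timeout whenever fewer than 2 of the replicas holding the requested partition answer in time.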
I repeated the two cases back and forth several times, and the results
remained the same.
I also observed CPU usage on the 3 Cassandra servers, and they were all much
higher for the de-normalized case.

               1 instance          5 instances       10 instances
Normalized     7% usr, 2% sys      30% usr, 8% sys   40% usr, 10% sys
Denormalized   44% usr, 0.3% sys   65% usr, 1% sys   70% usr, 2% sys

*Questions*
This is really not what I expected, and I am puzzled and have not figured out
a good explanation.
Why are there so many exceptions in the de-normalized case? I would think
Cassandra should be able to handle simultaneous accesses to the same data.
Why are there NO exceptions in the normalized case? I mean, the
environments for the two cases are basically the same.
Is an (internally) wide row only good for a small amount of data under each
column name?
Or is it an issue with the Java Driver?
Or did I do something wrong?


--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Denormalization-leads-to-terrible-rather-than-better-Cassandra-performance-I-am-really-puzzled-tp7600561.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
------=_Part_21528_199927625.1430247364919--