From: Aaron Morton <aaron@thelastpickle.com>
Subject: Re: Cassandra as storage for cache data
Date: Thu, 27 Jun 2013 16:51:40 +1200
To: user@cassandra.apache.org

I'll also add that you are probably running into some memory issues; 2.5 GB is a low heap size.

> -Xms2500M -Xmx2500M -Xmn400M

If you really do have a cache and want to reduce the disk activity, disable durable_writes on the keyspace. That will stop the writes from going to the commit log, which is one reason memtables are flushed to disk. The other reason is that memory usage approaches the memtable_total_space_in_mb setting. Modern (1.2) releases are very good at managing that memory, provided the jamm memory meter is working. With this approach and the other tips below you should be able to get better performance.

WARNING: disabling durable_writes means that writes are only in memory and will not be committed to disk until the CFs are flushed. You should *always* run nodetool drain before shutting down a node in this case.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 26/06/2013, at 8:52 AM, sankalp kohli wrote:

> Apart from what Jeremy said, you can try these:
> 1) Use replication = 1. It is cache data and you don't need persistence.
> 2) Try playing with memtable size.
> 3) Use the Netflix client library, as it will reduce one hop: it will choose the node that holds the data as the coordinator.
> 4) Work on your schema. You might want to have fewer columns in each row. With fatter rows, the bloom filter will report more sstables as eligible.
>
> -Sankalp
>
>
> On Tue, Jun 25, 2013 at 9:04 AM, Jeremy Hanna wrote:
> If you have rapidly expiring data, then tombstones are probably filling your disk and your heap (depending on how you order the data on disk). To check whether your queries are affected by tombstones, you might try the query tracing that's built in to 1.2.
> See:
> http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets -- has an example of tracing where you can see tombstones affecting a query
> http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
>
> You'll want to consider reducing the gc_grace period from the default of 10 days for those column families - with an understanding of why gc_grace exists in the first place; see http://wiki.apache.org/cassandra/DistributedDeletes . Even once the gc_grace period has passed, the tombstones will stay around until they are compacted away. So there are currently two options to compact them away more quickly:
> 1) Use leveled compaction - see http://www.datastax.com/dev/blog/when-to-use-leveled-compaction . Leveled compaction only requires 10% headroom (as opposed to 50% for size-tiered compaction) for the amount of disk that needs to be kept free.
> 2) If 1 doesn't work and you're still seeing performance degrade because tombstones aren't getting cleared out fast enough, you might consider keeping size-tiered compaction but performing regular major compactions to get rid of expired data.
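[Editor's note] The per-column-family changes Jeremy describes, together with Aaron's durable_writes tip earlier in the thread, can be applied from cqlsh roughly as follows. This is only a sketch against CQL 3 as of Cassandra 1.2, and the keyspace and table names (`cache_ks`, `cache_items`) are made up for illustration:

```sql
-- Hypothetical names: a keyspace 'cache_ks' with a table 'cache_items'.

-- Aaron's tip: skip the commit log for pure cache data.
-- (Remember: always run `nodetool drain` before stopping a node.)
ALTER KEYSPACE cache_ks WITH durable_writes = false;

-- Jeremy's tips: a much shorter gc_grace and leveled compaction,
-- so expired entries and their tombstones are purged sooner.
ALTER TABLE cache_ks.cache_items
  WITH gc_grace_seconds = 3600
  AND compaction = {'class': 'LeveledCompactionStrategy'};
```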
>
> Keep in mind, though, that if you use a gc_grace of 0 and do any kind of manual deletes outside of TTLs, you probably want to do the deletes at ConsistencyLevel.ALL; otherwise, if a node goes down and then comes back up, there's a chance that deleted data may be resurrected. That only applies to non-TTL data that you delete manually. See the explanation of distributed deletes for more information.
>
>
> On 25 Jun 2013, at 13:31, Dmitry Olshansky wrote:
>
> > Hello,
> >
> > We are using Cassandra as the data storage for our caching system. Our application generates about 20 put and get requests per second. The average size of one cache item is about 500 KB.
> >
> > Cache items are placed into one column family with a TTL of 20-60 minutes. Keys and values are bytes (not UTF-8 strings). The compaction strategy is SizeTieredCompactionStrategy.
> >
> > We set up a Cassandra 1.2.6 cluster of 4 nodes. The replication factor is 2. Each node has 10 GB of RAM and enough space on HDD.
> >
> > Now, when we put this cluster under load, it quickly fills with our runtime data (about 5 GB on every node) and we start observing performance degradation, with frequent timeouts on the client side.
> >
> > We see that on each node compaction starts very frequently and takes several minutes to complete. It seems that each node is usually busy with the compaction process.
> >
> > Here are the questions:
> >
> > What is the recommended configuration for our use case?
> >
> > Does it make sense to somehow tell Cassandra to keep all data in memory (memtables) to eliminate flushing it to disk (sstables), thus decreasing the number of compactions? How can we achieve this behavior?
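[Editor's note] A back-of-envelope calculation shows why the disk fills up here and why reducing gc_grace matters so much. The inputs are rough figures assumed from this thread (up to ~20 writes/s, ~500 KB per item, RF=2, 4 nodes, a worst-case 60-minute TTL), not measured values:

```python
# Rough estimate of on-disk footprint per node: expired cache items
# persist as tombstone-bearing data until gc_grace passes AND a
# compaction rewrites the relevant sstables, so the retention window
# is roughly TTL + gc_grace.
WRITE_RATE = 20        # puts per second (upper-bound assumption)
ITEM_KB = 500          # average item size in KB
TTL_S = 60 * 60        # worst-case 60-minute TTL
DAY_S = 24 * 60 * 60

def gb_per_node(gc_grace_s, rf=2, nodes=4):
    """Approximate GB retained per node before compaction can purge it."""
    retained_s = TTL_S + gc_grace_s
    total_kb = WRITE_RATE * ITEM_KB * retained_s * rf
    return total_kb / 1024 / 1024 / nodes

print(round(gb_per_node(10 * DAY_S)))  # default 10-day gc_grace: ~4137 GB/node
print(round(gb_per_node(3600)))        # 1-hour gc_grace: ~34 GB/node
```

Even as a crude upper bound this ignores compaction headroom; the point is that with the default gc_grace, expired cache entries would dominate the disk by two orders of magnitude.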
> >
> > Cassandra is started with the default shell script, which produces the following command line:
> >
> > jsvc.exec -user cassandra -home /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/../ -pidfile /var/run/cassandra.pid -errfile &1 -outfile /var/log/cassandra/output.log -cp <CLASSPATH_SKIPPED> -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -XX:HeapDumpPath=/var/lib/cassandra/java_1371805844.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1371805844.log -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms2500M -Xmx2500M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss180k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false org.apache.cassandra.service.CassandraDaemon
> >
> > --
> > Best regards,
> > Dmitry Olshansky
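[Editor's note] Given Aaron's warning about durable_writes, the safe shutdown sequence on a node looks roughly like this. It is an operational sketch assuming a stock package install; adjust the host and service name for your environment:

```shell
# Flush memtables to sstables and stop accepting writes first;
# with durable_writes off there is no commit log to replay, so
# anything not flushed here is lost.
nodetool -h localhost drain

# Only then stop the daemon.
sudo service cassandra stop
```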