Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: local policy includes SPF record at
 spf.trusted-forwarder.org)
MIME-Version: 1.0
In-Reply-To: <5278B48C.6070200@avast.com>
References: <5278B48C.6070200@avast.com>
Date: Tue, 5 Nov 2013 09:38:58 -0200
Message-ID: 
 <CAGvNKf+vRsPSpTvau5omHx6q_JZhRW9dJ5VZuWABKDcidC4UqQ@mail.gmail.com>
Subject: Re: Cass 2.0.0: Extensive memory allocation when row_cache enabled
From: Sávio Teles <savio.teles@lupa.inf.ufg.br>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a11c22ac072976804ea6c7aa4

--001a11c22ac072976804ea6c7aa4
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

We have the same problem.


2013/11/5 Jiri Horky <horky@avast.com>

> Hi there,
>
> we are seeing extensive memory allocation leading to quite long and
> frequent GC pauses when using the row cache. This is on a Cassandra 2.0.0
> cluster with the JNA 4.0 library and the following settings:
>
> key_cache_size_in_mb: 300
> key_cache_save_period: 14400
> row_cache_size_in_mb: 1024
> row_cache_save_period: 14400
> commitlog_sync: periodic
> commitlog_sync_period_in_ms: 10000
> commitlog_segment_size_in_mb: 32
>
> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10G -Xmx10G
> -Xmn1024M -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=/data2/cassandra-work/instance-1/cassandra-1383566283-pid1893.hprof
> -Xss180k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+UseCondCardMark
>
> We have disabled the row cache on one node to see the difference. Please
> see the attached plots from VisualVM; I think that the effect is quite
> visible. I have also taken 10x "jmap -histo" after 5s on an affected
> server and plotted the result, attached as well.
>
> I have taken a heap dump of the application when the heap size was 10GB;
> most of the memory was unreachable, which was expected. The majority was
> used by 55-59M objects of the HeapByteBuffer, byte[] and
> org.apache.cassandra.db.Column classes. I also include a list of inbound
> references to the HeapByteBuffer objects, from which it should be visible
> where they are being allocated. This was acquired using Eclipse MAT.
>
> Here is a comparison of GC times with the row cache enabled and disabled:
>
> prg01 - row cache enabled
>       - uptime 20h45m
>       - ConcurrentMarkSweep - 11494686 ms
>       - ParNew - 14690885 ms
>       - time spent in GC: 35%
> prg02 - row cache disabled
>       - uptime 23h45m
>       - ConcurrentMarkSweep - 251 ms
>       - ParNew - 230791 ms
>       - time spent in GC: 0.27%
>
> I would be grateful for any hints. Please let me know if you need any
> further information. For now, we are going to disable the row cache.
>
> Regards
> Jiri Horky
>
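
For anyone comparing their own nodes: the "time spent in GC" percentages in the quoted report appear to be the two collector totals added together and divided by uptime. Below is a minimal sketch of that arithmetic in Python, using only the numbers quoted above. The function name gc_percent and the dictionary layout are my own, for illustration only, and the assumption that both collector times simply add up is mine as well.

```python
def gc_percent(uptime_hours, uptime_minutes, gc_ms_by_collector):
    """Return the share of uptime spent in GC, as a percentage.

    Assumes the per-collector totals (in ms) can simply be summed.
    """
    uptime_ms = (uptime_hours * 60 + uptime_minutes) * 60 * 1000
    return 100.0 * sum(gc_ms_by_collector.values()) / uptime_ms


# Figures copied from the quoted report.
nodes = {
    "prg01 (row cache enabled)": (
        20, 45, {"ConcurrentMarkSweep": 11494686, "ParNew": 14690885}),
    "prg02 (row cache disabled)": (
        23, 45, {"ConcurrentMarkSweep": 251, "ParNew": 230791}),
}

for name, (hours, minutes, gc_ms) in nodes.items():
    print(f"{name}: {gc_percent(hours, minutes, gc_ms):.2f}% of uptime in GC")
```

This gives roughly 35% for prg01 and 0.27% for prg02, in line with the reported values.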


-- 
Best regards,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Master's student in Computer Science - UFG
Software Architect
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG

--001a11c22ac072976804ea6c7aa4--