From: Jake Luciani <jakers@gmail.com>
Date: Mon, 15 Jul 2013 10:08:26 -0400
Subject: Re: Why does cassandra PoolingSegmentedFile recycle the RandomAccessReader?
To: user@cassandra.apache.org

Take a look at https://issues.apache.org/jira/browse/CASSANDRA-5661

On Mon, Jul 15, 2013 at 4:18 AM, sulong <sulong1984@gmail.com> wrote:
> Thanks for your help. Yes, I will try to increase the sstable size. I hope it can save me.
> 9000 SSTableReader x 10 RandomAccessReader x 64KB = 5.6G of memory. If there were only one RandomAccessReader per SSTableReader, it would be 9000 * 1 * 64KB = 0.56G. That looks great, but I assume there must be a reason to recycle the RandomAccessReader.
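
A quick sanity check of the arithmetic above, as a minimal sketch (the 64 KB per-reader buffer and the counts are the figures quoted in this thread, not measured values):

public class ReaderMemoryEstimate {
    public static void main(String[] args) {
        long sstables = 9000;          // SSTableReader instances seen in the heap dump
        long readersPerSSTable = 10;   // pooled RandomAccessReaders per sstable, per the thread
        long bufferBytes = 64 * 1024;  // buffer held by each RandomAccessReader

        double pooled = sstables * readersPerSSTable * bufferBytes / (1024.0 * 1024 * 1024);
        double single = sstables * bufferBytes / (1024.0 * 1024 * 1024);

        System.out.printf("pooled readers: %.2f GiB%n", pooled); // ~5.5 GiB, roughly the 5.6G quoted
        System.out.printf("single reader:  %.2f GiB%n", single); // ~0.55 GiB
    }
}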


> On Mon, Jul 15, 2013 at 4:02 PM, Janne Jalkanen <janne.jalkanen@ecyrd.com> wrote:

>> I had exactly the same problem, so I increased the sstable size (from 5 to 50 MB; the default 5 MB is almost certainly too low for serious use cases). Now the number of SSTableReader objects is manageable, and my heap is happier.

>> Note that for immediate effect I stopped the node, removed the *.json files, and restarted, which put all SSTables in L0 and meant a weekend full of compactions… It would be really cool if there were a way to automatically drop all LCS SSTables one level down so they compact earlier, while avoiding the "OMG-must-compact-everything-aargh-my-L0-is-full" effect of removing the JSON file.

>> /Janne
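
Janne's first suggestion, raising the LCS sstable size, could look like the sketch below, assuming the DataStax Java driver (2.0-era API) and hypothetical keyspace/table names; the same statement can of course be run from cqlsh. Note the new size only applies to sstables written by future flushes and compactions:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class RaiseSstableSize {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();
            // Bump sstable_size_in_mb from the default toward 50 MB, as suggested above.
            session.execute("ALTER TABLE mykeyspace.mytable WITH compaction = "
                    + "{'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 50}");
        } finally {
            cluster.close();
        }
    }
}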
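
The manifest-removal step Janne describes might look like the sketch below (the data directory path is an assumption for illustration; the node must be stopped first, and a plain shell delete achieves the same thing):

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch of the "drop everything back to L0" trick: with the node down,
// delete the per-column-family LCS manifest (*.json) files so the node
// rebuilds its levels from scratch on restart. Expect heavy compaction
// activity afterwards, as described above.
public class DropLcsManifests {
    public static void main(String[] args) throws IOException {
        // Hypothetical layout; substitute your actual data_file_directories path.
        Path cfDir = Paths.get("/var/lib/cassandra/data/mykeyspace/mytable");
        try (DirectoryStream<Path> manifests = Files.newDirectoryStream(cfDir, "*.json")) {
            for (Path manifest : manifests) {
                System.out.println("removing " + manifest);
                Files.delete(manifest);
            }
        }
    }
}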

>> On 15 Jul 2013, at 10:48, sulong <sulong1984@gmail.com> wrote:

>> > Why does cassandra PoolingSegmentedFile recycle the RandomAccessReader? The RandomAccessReader objects consume too much memory.
>> >
>> > I have a cluster of 4 nodes. Every node's cassandra JVM has an 8G heap. The heap fills up after about one month, so I have to restart the 4 nodes every month.
>> >
>> > I have 100G of data on every node, with LeveledCompactionStrategy and a 10M sstable size, so there are more than 10000 sstable files. Looking through the heap dump file, I see more than 9000 SSTableReader objects in memory, which reference lots of RandomAccessReader objects. The memory is consumed by these RandomAccessReader objects.
>> >
>> > I see PoolingSegmentedFile has a recycle method, which puts the RandomAccessReader into a queue. It looks like the queue always grows until the sstable is compacted. Is there any way to stop the RandomAccessReader recycling? Or to set a limit on the number of recycled RandomAccessReaders?
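
For readers following along: the recycling being asked about is, in outline, an unbounded per-file pool of buffered readers. A minimal sketch of that pattern follows (illustrative names, not the actual Cassandra source; the CASSANDRA-5661 ticket linked at the top of this message addresses this growth):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// A finished reader is parked in an unbounded queue and handed back out on
// the next read, so the pool only shrinks when the sstable itself is closed,
// e.g. after compaction. Ten concurrent reads leave ten pooled readers.
class PooledReaderFile {
    private final Queue<ReaderSketch> pool = new ConcurrentLinkedQueue<>();

    ReaderSketch getSegment(long position) {
        ReaderSketch reader = pool.poll();
        if (reader == null)
            reader = new ReaderSketch(64 * 1024); // each reader owns a 64 KB buffer
        reader.seek(position);
        return reader;
    }

    void recycle(ReaderSketch reader) {
        pool.offer(reader); // nothing ever caps the queue size
    }

    public static void main(String[] args) {
        PooledReaderFile file = new PooledReaderFile();
        ReaderSketch r = file.getSegment(0);
        file.recycle(r); // the reader now waits in the pool until the file is closed
    }
}

class ReaderSketch {
    private final byte[] buffer;
    ReaderSketch(int bufferSize) { this.buffer = new byte[bufferSize]; }
    void seek(long position) { /* reposition within the underlying file */ }
}

With ~9000 sstables each holding up to ten such readers, the pooled 64 KB buffers alone account for the multi-gigabyte heap growth seen in the dump.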





--
http://twitter.com/tjake