From: James Golick <jamesgolick@gmail.com>
To: cassandra-user@incubator.apache.org
Date: Sun, 2 May 2010 19:49:34 -0700
Subject: Re: Row slice / cache performance

Just an update on this. I wrote a patch which attempts to solve this problem by keeping an index of columns that are marked for deletion, to avoid having to iterate over the whole column set and call columns_.get() over and over again.
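
To make that concrete, here is a rough sketch of the shape of the idea. The class, field, and method names are made up for illustration; this is not the actual patch or Cassandra's real ColumnFamily code:

import java.util.Comparator;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ConcurrentSkipListSet;

class ColumnFamilySketch {
    // Sorted column map, playing the role of ColumnFamily.columns_.
    private final ConcurrentSkipListMap<byte[], ColumnSketch> columns_;
    // Side index of the names of columns that are marked for deletion.
    private final ConcurrentSkipListSet<byte[]> deletedNames_;

    ColumnFamilySketch(Comparator<byte[]> comparator) {
        columns_ = new ConcurrentSkipListMap<byte[], ColumnSketch>(comparator);
        deletedNames_ = new ConcurrentSkipListSet<byte[]>(comparator);
    }

    void addColumn(byte[] name, ColumnSketch column) {
        columns_.put(name, column);
        if (column.isMarkedForDelete())
            deletedNames_.add(name);
    }

    // Only walk the tombstoned names instead of every column in the row.
    void removeDeleted(int gcBefore) {
        for (byte[] name : deletedNames_) {
            ColumnSketch c = columns_.get(name);
            if (c == null || !c.isMarkedForDelete()) {
                deletedNames_.remove(name);      // stale index entry
            } else if (c.getLocalDeletionTime() < gcBefore) {
                columns_.remove(name);           // tombstone is old enough to drop
                deletedNames_.remove(name);
            }
            // otherwise keep it indexed for a later pass
        }
    }

    interface ColumnSketch {
        boolean isMarkedForDelete();
        int getLocalDeletionTime();
    }
}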

My patch works, and the time spent in removeDeleted() is now close to zero. But the performance doesn't seem to have noticeably improved, so I'm not sure what I'm missing here. Either my test methodology is broken or I completely misread the profile.

On Sun, May 2, 2010 at 11:01 AM, James Golick <jamesgolick@gmail.com> wrote:
> Not sure why the first paragraph turned into a numbered bullet...


> On Sun, May 2, 2010 at 11:00 AM, James Golick <jamesgolick@gmail.com> wrote:
>> 1. I wrote the list a while back about less-than-great performance when reading thousands of columns even on cache hits. Last night, I decided to try to get to the bottom of why.

>> I tested this by setting the row cache capacity on a TimeUUIDType-sorted CF to 10, filling up a single row with 2000 columns, and only running queries against that row. That row was the only thing in the database. I rm -Rf'd the data before starting the test.

>> The tests were done from Coda Hale's Scala client cassie, which is just a thin layer around the Java Thrift bindings. I didn't actually time each call because that wasn't the objective, but I didn't really need to. Reads of 10 columns felt quick enough, but 100 columns was slower. 1000 columns would frequently cause the client to time out. The cache hit rate on that CF was 1.0, so, yes, the row was in cache.

>> Doing a thousand reads with count=100 in a single thread pegged my MacBook's CPU and caused the fans to spin up pretty loud.

>> So, I attached a profiler and repeated the test. I'm no expert on Cassandra internals, so please let me know if I'm way off here. The profiled reads were reversed=true, count=100.
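
For reference, each profiled read is roughly the following against the raw Thrift API. I'm writing this from memory of the 0.6-era interface, so treat the exact signatures as approximate, and the keyspace, column family, and key names are placeholders:

import java.util.List;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

public class SliceRead {
    public static void main(String[] args) throws Exception {
        TSocket socket = new TSocket("localhost", 9160);
        socket.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));

        // Reverse slice of the newest 100 columns, the same shape as the profiled reads.
        SliceRange range = new SliceRange();
        range.setStart(new byte[0]);
        range.setFinish(new byte[0]);
        range.setReversed(true);
        range.setCount(100);
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(range);

        ColumnParent parent = new ColumnParent();
        parent.setColumn_family("Timeline");

        List<ColumnOrSuperColumn> columns =
                client.get_slice("Keyspace1", "the-hot-row", parent, predicate, ConsistencyLevel.ONE);
        System.out.println("got " + columns.size() + " columns");

        socket.close();
    }
}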

>> As far as I can tell, there are three components taking up most of the time on this type of read (row slice out of cache):
>> 1. ColumnFamilyStore.removeDeleted() @ ~40% - Most of the time in here is actually spent materializing UUID objects so that they can be compared in the ConcurrentSkipListMap (ColumnFamily.columns_).
>> 2. SliceQueryFilter.getMemColumnIterator @ ~30% - Virtually all the time in here is spent in ConcurrentSkipListMap$Values.toArray().
>> 3. QueryFilter.collectCollatedColumns @ ~30% - All the time being spent in ColumnFamily.addColumn, and about half of the total spent materializing UUIDs for comparison.
>> This profile is consistent with the decrease in performance at higher values for count. If there are more UUIDs to deserialize, the time spent in removeDeleted() and collectCollatedColumns() should increase (roughly) linearly.
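
As a toy illustration of where that cost comes from (this is not Cassandra's TimeUUIDType code, just the shape of the problem): every comparison the skip list performs has to rebuild UUID objects from the raw column-name bytes, and every get() or put() performs O(log n) comparisons.

import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.UUID;
import java.util.concurrent.ConcurrentSkipListMap;

public class UuidCompareCost {
    // Comparator over raw 16-byte column names that materializes a UUID for each
    // argument on every single compare() call.
    static final Comparator<byte[]> BY_UUID = new Comparator<byte[]>() {
        public int compare(byte[] a, byte[] b) {
            return toUuid(a).compareTo(toUuid(b)); // two allocations per comparison
        }
    };

    static UUID toUuid(byte[] raw) {
        ByteBuffer bb = ByteBuffer.wrap(raw);
        return new UUID(bb.getLong(), bb.getLong());
    }

    static byte[] toBytes(UUID u) {
        return ByteBuffer.allocate(16)
                .putLong(u.getMostSignificantBits())
                .putLong(u.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<byte[], String> columns =
                new ConcurrentSkipListMap<byte[], String>(BY_UUID);
        for (int i = 0; i < 2000; i++)
            columns.put(toBytes(UUID.randomUUID()), "value-" + i);
        // Each get() pays roughly log2(2000) ~= 11 comparisons, so ~22 UUID
        // allocations; a removeDeleted()-style pass that calls get() for every
        // column multiplies that by the column count.
        System.out.println(columns.size());
    }
}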

>> So, my question at this point is how to fix it. I have some basic ideas, but being new to Cassandra internals, I'm not sure they make any sense. Help me out here:
>> 1. Optionally call removeDeleted() less often. I realize that this is probably a bad idea for a lot of reasons, but it was the first thing I thought of.
>> 2. When a ColumnFamily object is put into the row cache, copy the columns over to another data structure that doesn't need to be sorted on get() (roughly what the sketch after this list shows). If columns_ needs to be kept around, this option would have a memory impact, but at least for us it'd be well worth it for the speed.
>> 3. ????
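
For idea 2, this is the kind of structure I have in mind for the cached copy. It's a completely made-up class, not the actual row cache: snapshot the columns into flat lists once when the row goes into the cache, so a reversed slice of N columns is just an index walk, with no comparator calls and no UUID deserialization.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

class CachedRowSnapshot {
    private final List<byte[]> names = new ArrayList<byte[]>();
    private final List<byte[]> values = new ArrayList<byte[]>();

    // Built once when the ColumnFamily is put into the row cache; the skip list's
    // iteration order already yields the columns sorted by the comparator.
    CachedRowSnapshot(ConcurrentSkipListMap<byte[], byte[]> columns) {
        for (Map.Entry<byte[], byte[]> e : columns.entrySet()) {
            names.add(e.getKey());
            values.add(e.getValue());
        }
    }

    // Reverse slice of up to `count` column values starting from the newest.
    List<byte[]> reverseSlice(int count) {
        List<byte[]> out = new ArrayList<byte[]>(count);
        for (int i = values.size() - 1; i >= 0 && out.size() < count; i--)
            out.add(values.get(i));
        return out;
    }
}
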
>> I'd love to hear feedback on these / the rest of this (long) post.

