From: Alprema <alprema@alprema.com>
Date: Mon, 11 May 2015 13:52:00 +0200
Subject: Re: Read performance
To: user@cassandra.apache.org

According to the trace log, only one SSTable was read; the compaction strategy is size-tiered. I attached a more readable version of my trace for details.

On Mon, May 11, 2015 at 11:35 AM, Anishek Agarwal wrote:

> How many SSTables were there? What compaction strategy are you using? These properties determine how many disk reads Cassandra may have to do to get all the data you need, depending on which SSTables hold data for your partition key.
>
> On Fri, May 8, 2015 at 6:25 PM, Alprema wrote:
>
>> I was planning on using a more "server-friendly" strategy anyway (by parallelizing my workload across multiple metrics), but my concern here is more about the raw numbers.
>>
>> According to the trace and my estimate of the data size, the read from disk ran at about 30 MByte/s and the transfer between the responsible node and the coordinator at about 120 Mbit/s, which doesn't seem right given that the cluster was not busy and the network is Gbit-capable.
>>
>> I know there is some overhead, but these numbers seem odd to me. Do they seem normal to you?
>>
>> On Fri, May 8, 2015 at 2:34 PM, Bryan Holladay <holladay@longsight.com> wrote:
>>
>>> Try breaking it up into smaller chunks using multiple threads and token ranges. 86400 is pretty large; I found ~1000 results per query works well. This will spread the burden across all servers a little more evenly.
>>>
>>> On Thu, May 7, 2015 at 4:27 AM, Alprema wrote:
>>>
>>>> Hi,
>>>>
>>>> I am writing an application that will periodically read large amounts of data from Cassandra, and I am seeing odd performance.
>>>>
>>>> My column family is a classic time-series one, with series ID and day as the partition key and a timestamp as the clustering key, the value being a double.
>>>>
>>>> The query I run gets all the values of a given time series for a given day (about 86400 points):
>>>>
>>>> SELECT "UtcDate", "Value"
>>>> FROM "Metric_OneSec"
>>>> WHERE "MetricId" = 12215ece-6544-4fcf-a15d-4f9e9ce1567e
>>>> AND "Day" = '2015-05-05 00:00:00+0000'
>>>> LIMIT 86400;
>>>>
>>>> This takes about 450 ms to run, and when I trace the query I see that it takes about 110 ms to read the data from disk and 224 ms to send the data from the responsible node to the coordinator (full trace in attachment).
>>>>
>>>> I did a quick estimate of the requested data size (correct me if I'm wrong):
>>>>
>>>> 86400 * (column name + column value + timestamp + ttl)
>>>> = 86400 * (8 + 8 + 8 + 8?)
>>>> = 2.6 MB
>>>>
>>>> Let's say about 3 MB with misc. overhead, so these timings seem pretty slow to me for a modern SSD and a 1 Gbit/s NIC.
>>>>
>>>> Do those timings seem normal? Am I missing something?
>>>>
>>>> Thank you,
>>>>
>>>> Kévin
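[Editor's note] The back-of-envelope arithmetic in the thread can be checked directly. This is only a sketch: the per-cell sizes and the 110 ms / 224 ms timings are the thread's own figures, and the printed throughputs are raw payload only (the ~30 MB/s and ~120 Mbit/s quoted earlier also include serialization and protocol overhead).

```python
# Back-of-envelope check of the figures discussed in this thread.
# The per-cell sizes are the poster's rough estimates, not
# Cassandra's actual on-disk layout.
CELLS = 86400                      # one data point per second for a day
BYTES_PER_CELL = 8 + 8 + 8 + 8     # name + value + timestamp + ttl (guess)

payload = CELLS * BYTES_PER_CELL   # bytes
print(f"payload ~ {payload / 2**20:.1f} MiB")                     # ~2.6 MiB

disk_read_s = 0.110                # ~110 ms on the replica (from the trace)
transfer_s = 0.224                 # replica -> coordinator (from the trace)

# Raw payload only, before any overhead:
print(f"disk    ~ {payload / disk_read_s / 1e6:.0f} MB/s")        # ~25 MB/s
print(f"network ~ {payload * 8 / transfer_s / 1e6:.0f} Mbit/s")   # ~99 Mbit/s
```

So the quoted ~30 MB/s and ~120 Mbit/s figures are consistent with the payload estimate plus modest overhead, yet still far below what an SSD and a GbE link can sustain, which is the poster's point.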
Attachment: cql_trace.txt

 activity                                                                  | timestamp    | source | source_elapsed
---------------------------------------------------------------------------+--------------+--------+----------------
 execute_cql3_query                                                        | 09:25:45,027 | node01 |              0
 Message received from /node01                                             | 09:25:45,021 | node02 |             10
 Executing single-partition query on Metric_OneSec                         | 09:25:45,021 | node02 |            156
 Acquiring sstable references                                              | 09:25:45,021 | node02 |            164
 Merging memtable tombstones                                               | 09:25:45,021 | node02 |            179
 Bloom filter allows skipping sstable 5153                                 | 09:25:45,021 | node02 |            198
 Bloom filter allows skipping sstable 5152                                 | 09:25:45,021 | node02 |            205
 Bloom filter allows skipping sstable 5151                                 | 09:25:45,021 | node02 |            211
 Bloom filter allows skipping sstable 5146                                 | 09:25:45,021 | node02 |            217
 Key cache hit for sstable 5125                                            | 09:25:45,021 | node02 |            228
 Seeking to partition beginning in data file                               | 09:25:45,021 | node02 |            231
 Bloom filter allows skipping sstable 5040                                 | 09:25:45,022 | node02 |            470
 Bloom filter allows skipping sstable 4955                                 | 09:25:45,022 | node02 |            479
 Bloom filter allows skipping sstable 4614                                 | 09:25:45,022 | node02 |            485
 Skipped 0/8 non-slice-intersecting sstables, included 0 due to tombstones | 09:25:45,022 | node02 |            491
 Merging data from memtables and 1 sstables                                | 09:25:45,022 | node02 |            495
 Parsing SELECT "Value" FROM "Metric_OneSec"
 WHERE "MetricId" = 12215ece-6544-4fcf-a15d-4f9e9ce1567e
 AND "Day" = '2015-05-05 00:00:00+0000'
 LIMIT 86400;                                                              | 09:25:45,027 | node01 |             23
 Preparing statement                                                       | 09:25:45,027 | node01 |            115
 Sending message to /node02                                                | 09:25:45,027 | node01 |            798
 Read 86090 live and 0 tombstoned cells                                    | 09:25:45,135 | node02 |         113809
 Enqueuing response to /node01                                             | 09:25:45,135 | node02 |         114046
 Sending message to /node01                                                | 09:25:45,135 | node02 |         114108
 Message received from /node02                                             | 09:25:45,365 | node01 |         338615
 Processing response from /node02                                          | 09:25:45,365 | node01 |         338654
 Request complete                                                          | 09:25:45,455 | node01 |         428111
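[Editor's note] Bryan's suggestion of splitting the work across threads and token ranges can be sketched as pure range arithmetic over the default Murmur3Partitioner's ring. The helper name below is made up for this sketch (it is not part of any Cassandra driver); issuing one `SELECT ... WHERE token(pk) > lo AND token(pk) <= hi` per range, each on its own thread, is left to whichever driver you use. Note this helps when scanning many partitions; within a single partition you would chunk on the clustering key instead.

```python
# Split the Murmur3Partitioner token ring into n contiguous (lo, hi]
# sub-ranges, one per worker thread. token_ranges is a hypothetical
# helper written for this sketch, not a driver API.
MIN_TOKEN = -2**63       # Murmur3Partitioner ring bounds
MAX_TOKEN = 2**63 - 1

def token_ranges(n):
    """Yield n contiguous (lo, hi] ranges covering the whole ring."""
    span = (MAX_TOKEN - MIN_TOKEN) // n
    lo = MIN_TOKEN
    for i in range(n):
        # Last range absorbs any rounding remainder.
        hi = MAX_TOKEN if i == n - 1 else lo + span
        yield lo, hi
        lo = hi

for lo, hi in token_ranges(4):
    print(lo, hi)
```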