cassandra-user mailing list archives

From S G <sg.online.em...@gmail.com>
Subject Re: How can I scale my read rate?
Date Sat, 18 Mar 2017 07:00:04 GMT
I have enabled JMX but I'm not sure which metrics to look for - there are way
too many of them.
I am using session.execute(...)
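Since the thread above contrasts `session.execute(...)` with `session.executeAsync(...)`, here is a minimal, self-contained sketch of the async pattern with bounded in-flight requests. It does not use the DataStax driver itself: `executeAsyncStub` is a hypothetical stand-in for `session.executeAsync(...)` that completes on a fake "network" scheduler, so the example runs without a cluster. The semaphore is what lets one thread keep many reads outstanding instead of blocking on each round trip.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class AsyncReadSketch {
    // Hypothetical stand-in for session.executeAsync(...):
    // completes after a fake 1 ms "network" delay.
    static final ScheduledExecutorService net = Executors.newScheduledThreadPool(4);

    static CompletableFuture<String> executeAsyncStub(String key) {
        CompletableFuture<String> f = new CompletableFuture<>();
        net.schedule(() -> f.complete("row-" + key), 1, TimeUnit.MILLISECONDS);
        return f;
    }

    // Issue `total` reads while keeping at most `inFlight` outstanding at once.
    static long run(int total, int inFlight) throws InterruptedException {
        Semaphore permits = new Semaphore(inFlight);
        AtomicLong done = new AtomicLong();
        CountDownLatch latch = new CountDownLatch(total);
        for (int i = 0; i < total; i++) {
            permits.acquire();  // throttle submissions instead of blocking per read
            executeAsyncStub("k" + i).whenComplete((r, e) -> {
                done.incrementAndGet();
                permits.release();
                latch.countDown();
            });
        }
        latch.await();          // wait for all callbacks to fire
        return done.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(1000, 100) + " reads completed");
        net.shutdown();
    }
}
```

With the real driver the shape is the same: fire the future, register a callback, and release a permit when it completes, rather than calling the blocking `execute`.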


On Fri, Mar 17, 2017 at 2:07 PM, Arvydas Jonusonis <
arvydas.jonusonis@gmail.com> wrote:

> It would be interesting to see some of the driver metrics (in your stress
> test tool) - if you enable JMX, they should be exposed by default.
>
> Also, are you using session.execute(..) or session.executeAsync(..) ?
>
> Arvydas
>
> On Thu, Mar 16, 2017 at 00:44 Eric Stevens <mightye@gmail.com> wrote:
>
>> I assume your graphs are from a node on the Cassandra cluster.  You're
>> really not stressing that cluster out; a load average under 1 means an
>> extremely bored cluster wishing you would give it something to work on.  I
>> have a hunch that the limiting factor might be in your read stress-testing
>> tool.
>>
>> Are you doing sequential reads, where each thread waits for the previous
>> read to complete before attempting the next one?  If so, you might be
>> spending most of your time just waiting on network round trips.  I would
>> suggest slowly ramping up the number of simultaneous outstanding reads per
>> thread until you start to get a non-trivial number of read timeout
>> exceptions - that point is where you've hit your maximum read throughput.
>> Depending on what kind of disks you have, you might expect per-node load
>> averages to be in the 30+ range (maybe lower; that heap size is a bit
>> tight).
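A back-of-envelope check makes the "waiting on round trips" hypothesis above concrete. With blocking reads, Little's law bounds throughput at (outstanding requests) / (round-trip time). The 40 ms round trip below is an assumption, not a measured value, chosen because it reproduces the observed numbers: 600 blocking threads at ~40 ms each yields exactly the 15,000 reads/second reported in this thread.

```java
public class ThroughputEstimate {
    public static void main(String[] args) {
        int totalThreads = 600;        // 15 clients x 40 threads, one blocking read each
        double assumedRttSec = 0.040;  // hypothetical 40 ms effective round trip
        // Little's law: max throughput = concurrency / latency
        long maxThroughput = Math.round(totalThreads / assumedRttSec);
        System.out.println(maxThroughput + " reads/sec"); // prints "15000 reads/sec"
    }
}
```

If the real per-request latency is anywhere near that figure, the cluster never sees enough concurrent work to be stressed, which matches the near-idle load averages in the graphs.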
>>
>> On Wed, Mar 15, 2017 at 4:22 PM Jonathan Haddad <jon@jonhaddad.com>
>> wrote:
>>
>> Really no need to be rude.  Why the hostility?
>>
>> On Wed, Mar 15, 2017 at 2:12 PM daemeon reiydelle <daemeonr@gmail.com>
>> wrote:
>>
>> With that brain-damaged memory setting, yes.
>>
>>
>> *.......*
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> *USA (+1) 415.501.0198*
>> *London (+44) (0) 20 8144 9872*
>>
>> On Tue, Mar 14, 2017 at 3:45 PM, S G <sg.online.email@gmail.com> wrote:
>>
>> Hi,
>>
>> I am trying to scale my read throughput.
>>
>> I have a 12 node Cassandra cluster, each node has:
>> RAM: 5.5gb
>> Disk: 64gb
>> C* version: 3.3
>> Java: 1.8.0_51
>>
>> The cluster has 2 tables - each with 2 million rows.
>> Partition keys for all these rows are unique.
>>
>> I am stress-reading it from a 15-node client cluster.
>> Each read-client has 40 threads, so 600 read-threads in total from 15
>> machines.
>> Each read query is get-by-primary-key where keys are read randomly from a
>> file having all primary-keys.
>>
>>
>> I am able to get only 15,000 reads/second from the entire system.
>> Is this the best read performance I can expect?
>> Are there any benchmarks I can compare against?
>>
>>
>> I tried setting bloom_filter_fp_chance from 0.01 to 0.0001 and caching to
>> {'keys': 'ALL', 'rows_per_partition': 'ALL'} but it had no effect.
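For reference, the two settings mentioned above are table options, applied per table with `ALTER TABLE`; `ks.readings` below is a hypothetical keyspace/table name, not one from this thread.

```cql
-- bloom_filter_fp_chance only helps reads that miss SSTables; row caching
-- must also be enabled in cassandra.yaml (row_cache_size_in_mb > 0) to take effect.
ALTER TABLE ks.readings
  WITH bloom_filter_fp_chance = 0.0001
   AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};
```

If the client is the bottleneck, as suggested elsewhere in this thread, neither option would be expected to move the throughput number.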
>>
>> I have also captured a few JMX metric graphs.
>>
>> Garbage-collection metrics did not look very interesting, so I have
>> not included them.
>>
>> Any ideas what more I can do to debug the system and increase its read
>> throughput?
>>
>> Thanks
>> SG
>>
>>
>>
>> On Sat, Mar 11, 2017 at 2:33 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
>>
>> 5.5G of RAM isn't very much for a JVM-based database. Having fewer
>> instances with more RAM will likely give you better performance.
>>
>> Also, this is probably a better question for the user list than the Dev
>> list
>>
>> --
>> Jeff Jirsa
>>
>>
>> > On Mar 11, 2017, at 1:20 PM, S G <sg.online.email@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I have a 9 node cassandra cluster, each node has:
>> > RAM: 5.5gb
>> > Disk: 64gb
>> > C* version: 3.3
>> > Java: 1.8.0_51
>> >
>> > The cluster stores about 2 million rows and partition keys for all these
>> > rows are unique.
>> >
>> > I am stress-reading it from a 12-node client cluster.
>> > Each read-client has 50 threads, so total 600 read-threads from 12
>> machines.
>> > Each read query is get-by-primary-key where keys are read randomly from
>> a
>> > file having all primary-keys.
>> >
>> >
>> > I am able to get only 15,000 reads/second from the entire system.
>> > Is this the best read performance I can expect?
>> > Are there any benchmarks I can compare against?
>> >
>> >
>> > I tried setting bloom_filter_fp_chance from 0.01 to 0.0001 and caching
>> to
>> > {'keys': 'ALL', 'rows_per_partition': 'ALL'} but it had no effect.
>> >
>> > Thx,
>> > SG
>>
>>
>>
>>
