cassandra-user mailing list archives

From Eric Stevens <migh...@gmail.com>
Subject Re: How can I scale my read rate?
Date Wed, 15 Mar 2017 23:19:20 GMT
I assume your graphs are from a node in the Cassandra cluster.  You're
really not stressing that cluster: a load average under 1 is an extremely
bored cluster wishing you would give it something to work on.  I have a
hunch that the limiting factor is in your read stress testing tool.

Are you doing sequential reads, where each thread waits for the previous
read to complete before attempting the next?  If so, you might be spending
most of your time just waiting on network round trips.  I would suggest
slowly ramping up the number of simultaneous outstanding reads per thread
until you start to get a non-trivial number of read timeout exceptions -
that point is where you've hit your maximum read throughput.  Depending on
what kind of disks you have there, you might expect per-node load averages
in the 30+ range (maybe lower, since that heap size is a bit tight).
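As a rough illustration of that ramp-up approach, here is a minimal sketch.
The read function is a stub standing in for a real per-key read (with the
DataStax driver you would issue the query asynchronously, e.g. via
execute_async, and count timeout exceptions); all names here are
illustrative, not from this thread.

```python
# Sketch: ramp up the number of outstanding reads and watch throughput.
# read_stub simulates one get-by-primary-key round trip; replace it with
# your actual client call against the cluster.
import concurrent.futures
import time

def read_stub(key):
    """Placeholder for a single get-by-primary-key read (~1 ms RTT)."""
    time.sleep(0.001)
    return key

def measure_throughput(outstanding, n_requests=200):
    """Issue n_requests reads with `outstanding` in flight at once;
    return achieved reads/second."""
    start = time.monotonic()
    with concurrent.futures.ThreadPoolExecutor(max_workers=outstanding) as ex:
        futures = [ex.submit(read_stub, k) for k in range(n_requests)]
        for f in concurrent.futures.as_completed(futures):
            f.result()  # in a real test, catch and count read timeouts here
    return n_requests / (time.monotonic() - start)

if __name__ == "__main__":
    # Slowly ramp concurrency; against a real cluster, stop ramping once
    # read timeout exceptions become non-trivial.
    for outstanding in (1, 2, 4, 8, 16, 32):
        rate = measure_throughput(outstanding)
        print(f"{outstanding:3d} outstanding -> {rate:8.0f} reads/s")
```

With a synchronous one-read-at-a-time loop, throughput is capped at roughly
1/RTT per thread regardless of how idle the cluster is, which is why the
ramp matters.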

On Wed, Mar 15, 2017 at 4:22 PM Jonathan Haddad <jon@jonhaddad.com> wrote:

> Really no need to be rude.  Why the hostility?
>
> On Wed, Mar 15, 2017 at 2:12 PM daemeon reiydelle <daemeonr@gmail.com>
> wrote:
>
> with that brain damaged memory setting yes.
>
>
> Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198
> London (+44) (0) 20 8144 9872
>
> On Tue, Mar 14, 2017 at 3:45 PM, S G <sg.online.email@gmail.com> wrote:
>
> Hi,
>
> I am trying to scale my read throughput.
>
> I have a 12 node Cassandra cluster, each node has:
> RAM: 5.5gb
> Disk: 64gb
> C* version: 3.3
> Java: 1.8.0_51
>
> The cluster has 2 tables - each with 2 million rows.
> Partition keys for all these rows are unique.
>
> I am stress-reading it from a 15-node client cluster.
> Each read-client has 40 threads, so 600 read-threads total from the 15
> machines.
> Each read query is get-by-primary-key where keys are read randomly from a
> file having all primary-keys.
>
>
> I am able to get only 15,000 reads/second from the entire system.
> Is this the best read performance I can expect?
> Are there any benchmarks I can compare against?
>
>
> I tried setting bloom_filter_fp_chance from 0.01 to 0.0001 and caching to
> {'keys': 'ALL', 'rows_per_partition': 'ALL'} but it had no effect.
>
> I have also captured a few JMX metric graphs, attached below.
> [inline graph images omitted]
>
> Garbage collection metrics did not look very interesting, so I have not
> included them.
>
> Any ideas what more I can do to debug the system and increase its read
> throughput?
>
> Thanks
> SG
>
>
>
> On Sat, Mar 11, 2017 at 2:33 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
>
> 5.5G of RAM isn't very much for a JVM-based database. Having fewer
> instances with more RAM will likely give you better performance.
>
> Also, this is probably a better question for the user list than the Dev
> list
>
> --
> Jeff Jirsa
>
>
> > On Mar 11, 2017, at 1:20 PM, S G <sg.online.email@gmail.com> wrote:
> >
> > Hi,
> >
> > I have a 9 node cassandra cluster, each node has:
> > RAM: 5.5gb
> > Disk: 64gb
> > C* version: 3.3
> > Java: 1.8.0_51
> >
> > The cluster stores about 2 million rows, and the partition keys for all
> > these rows are unique.
> >
> > I am stress-reading it from a 12-node client cluster.
> > Each read-client has 50 threads, so 600 read-threads total from the 12
> machines.
> > Each read query is get-by-primary-key where keys are read randomly from a
> > file having all primary-keys.
> >
> >
> > I am able to get only 15,000 reads/second from the entire system.
> > Is this the best read performance I can expect?
> > Are there any benchmarks I can compare against?
> >
> >
> > I tried setting bloom_filter_fp_chance from 0.01 to 0.0001 and caching to
> > {'keys': 'ALL', 'rows_per_partition': 'ALL'} but it had no effect.
> >
> > Thx,
> > SG
>
>
>
>
