cassandra-user mailing list archives

From: S G <sg.online.em...@gmail.com>
Subject: Re: How can I scale my read rate?
Date: Tue, 14 Mar 2017 22:45:59 GMT
Hi,

I am trying to scale my read throughput.

I have a 12 node Cassandra cluster, each node has:
RAM: 5.5 GB
Disk: 64 GB
C* version: 3.3
Java: 1.8.0_51

The cluster has 2 tables, each with 2 million rows.
Partition keys for all these rows are unique.

I am stress-reading it from a 15-node client cluster.
Each read client runs 40 threads, so 600 read threads in total across the 15
client machines.
Each read query is a get-by-primary-key, with keys picked at random from a
file containing all the primary keys.
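
For concreteness, each client JVM does roughly the following (a minimal
sketch assuming the DataStax Java driver 3.x; the contact point, keyspace,
table, key column, and file name below are placeholders, not my real schema):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

public class ReadStressClient {
    public static void main(String[] args) throws Exception {
        // Contact point and keyspace are placeholders.
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect("my_keyspace")) {

            // File with one primary key per line, dumped ahead of the test.
            List<String> keys = Files.readAllLines(Paths.get("all-keys.txt"));

            // The get-by-primary-key query, prepared once and shared by all threads.
            PreparedStatement select =
                session.prepare("SELECT * FROM my_table WHERE id = ?");

            // 40 reader threads per client machine, each picking keys at random.
            ExecutorService pool = Executors.newFixedThreadPool(40);
            for (int t = 0; t < 40; t++) {
                pool.submit(() -> {
                    for (int i = 0; i < 1_000_000; i++) {
                        String key = keys.get(
                            ThreadLocalRandom.current().nextInt(keys.size()));
                        session.execute(select.bind(key));
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.DAYS);
        }
    }
}

Each client machine runs one JVM like this, so 15 clients x 40 threads give
the 600 concurrent readers mentioned above.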


I am able to get only 15,000 reads/second from the entire system.
Is this the best read performance I can expect?
Are there any benchmarks I can compare against?


I tried lowering bloom_filter_fp_chance from 0.01 to 0.0001 and setting
caching to {'keys': 'ALL', 'rows_per_partition': 'ALL'} (roughly as sketched
below), but neither change had any noticeable effect on throughput.
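
The table-level changes were roughly equivalent to the following (again a
sketch with placeholder contact point and keyspace/table names; the same
ALTER TABLE could just as well be run from cqlsh):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class ApplyTableTuning {
    public static void main(String[] args) {
        // Contact point and keyspace/table names are placeholders.
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect()) {
            // Lower the bloom filter false-positive chance and cache all keys/rows.
            session.execute(
                "ALTER TABLE my_keyspace.my_table"
                + " WITH bloom_filter_fp_chance = 0.0001"
                + " AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}");
        }
    }
}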

I have also captured a few JMX metric graphs, shown below:

[inline images: JMX metric graphs]
Garbage-collection metrics did not look very interesting, so I have not
included them.

Any ideas what more I can do to debug the system and increase its read
throughput?

Thanks
SG



On Sat, Mar 11, 2017 at 2:33 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:

> 5.5G of ram isn't very much for a jvm based database. Having fewer
> instances with more ram will likely give you better performance.
>
> Also, this is probably a better question for the user list than the Dev
> list
>
> --
> Jeff Jirsa
>
>
> > On Mar 11, 2017, at 1:20 PM, S G <sg.online.email@gmail.com> wrote:
> >
> > Hi,
> >
> > I have a 9 node cassandra cluster, each node has:
> > RAM: 5.5gb
> > Disk: 64gb
> > C* version: 3.3
> > Java: 1.8.0_51
> >
> > The cluster stores about 2 million rows and partition keys for all these
> > rows are unique.
> >
> > I am stress-reading it from a 12-node client cluster.
> > Each read-client has 50 threads, so total 600 read-threads from 12
> machines.
> > Each read query is get-by-primary-key where keys are read randomly from a
> > file having all primary-keys.
> >
> >
> > I am able to get only 15,000 reads/second from the entire system.
> > Is this the best read performance I can expect?
> > Are there any benchmarks I can compare against?
> >
> >
> > I tried setting bloom_filter_fp_chance from 0.01 to 0.0001 and caching to
> > {'keys': 'ALL', 'rows_per_partition': 'ALL'} but it had no effect.
> >
> > Thx,
> > SG
>
