cassandra-user mailing list archives

From benjamin roth <brs...@gmail.com>
Subject Re: How can I scale my read rate?
Date Wed, 15 Mar 2017 06:34:18 GMT
Maybe lower the compression chunk size and give LCS a try.
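
For a point-read-heavy table, that tuning might look like the following ALTER TABLE statements (keyspace and table names are placeholders; the default chunk_length_in_kb is 64, and smaller chunks such as 4-16 KB often suit single-row reads better):

```sql
-- Hypothetical table; adjust names to your schema.
-- Smaller compression chunks mean less data must be decompressed per point read.
ALTER TABLE my_keyspace.my_table
  WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 4};

-- LCS keeps most reads to a single SSTable, at the cost of more compaction I/O.
ALTER TABLE my_keyspace.my_table
  WITH compaction = {'class': 'LeveledCompactionStrategy'};
```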

On Mar 15, 2017 1:22 AM, "D. Salvatore" <dd.salvatore@gmail.com> wrote:

> Hi,
> What is the average size of your rows, roughly? Are you requesting the
> whole row or just some fields?
> Did you check the disk utilisation?
>
> On my testbed, composed of 4 VMs each with 8 CPUs and 4 GB of RAM, I get
> roughly 14,500 reads/sec.
>
> Salvatore
>
> 2017-03-14 22:45 GMT+00:00 S G <sg.online.email@gmail.com>:
>
>> Hi,
>>
>> I am trying to scale my read throughput.
>>
>> I have a 12 node Cassandra cluster, each node has:
>> RAM: 5.5 GB
>> Disk: 64 GB
>> C* version: 3.3
>> Java: 1.8.0_51
>>
>> The cluster has 2 tables - each with 2 million rows.
>> Partition keys for all these rows are unique.
>>
>> I am stress-reading it from a 15-node client cluster.
>> Each read-client has 40 threads, so 600 read-threads in total from 15
>> machines.
>> Each read query is get-by-primary-key where keys are read randomly from a
>> file having all primary-keys.
>>
>>
>> I am able to get only 15,000 reads/second from the entire system.
>> Is this the best read performance I can expect?
>> Are there any benchmarks I can compare against?
>>
>>
>> I tried lowering bloom_filter_fp_chance from 0.01 to 0.0001 and setting
>> caching to {'keys': 'ALL', 'rows_per_partition': 'ALL'}, but it had no
>> effect.
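>>
>> For reference, those settings correspond to CQL statements like the
>> following (keyspace and table names are placeholders):
>>
>> ```sql
>> -- Hypothetical table; adjust names to your schema.
>> ALTER TABLE my_keyspace.my_table
>>   WITH bloom_filter_fp_chance = 0.0001;
>>
>> ALTER TABLE my_keyspace.my_table
>>   WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};
>> ```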
>>
>> I have also captured a few JMX metric graphs (inline images, omitted
>> here).
>>
>> Garbage-collection metrics did not look very interesting, so I have not
>> included them.
>>
>> Any ideas what more I can do to debug the system and increase its read
>> throughput?
>>
>> Thanks
>> SG
>>
>>
>>
>> On Sat, Mar 11, 2017 at 2:33 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
>>
>>> 5.5 GB of RAM isn't very much for a JVM-based database. Having fewer
>>> instances with more RAM will likely give you better performance.
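>>>
>>> On a host with more RAM, the JVM heap can be sized explicitly in
>>> cassandra-env.sh; the values below are illustrative, not a
>>> recommendation for any specific hardware:
>>>
>>> ```shell
>>> # cassandra-env.sh -- illustrative values for a host with ~16 GB RAM
>>> MAX_HEAP_SIZE="8G"     # fixed heap size avoids runtime resizing pauses
>>> HEAP_NEWSIZE="800M"    # ~100 MB per CPU core is the usual CMS guideline
>>> ```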
>>>
>>> Also, this is probably a better question for the user list than the Dev
>>> list
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> > On Mar 11, 2017, at 1:20 PM, S G <sg.online.email@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > I have a 9 node cassandra cluster, each node has:
>>> > RAM: 5.5 GB
>>> > Disk: 64 GB
>>> > C* version: 3.3
>>> > Java: 1.8.0_51
>>> >
>>> > The cluster stores about 2 million rows, and the partition keys for
>>> > all these rows are unique.
>>> >
>>> > I am stress-reading it from a 12-node client cluster.
>>> > Each read-client has 50 threads, so 600 read-threads in total from 12
>>> > machines.
>>> > Each read query is get-by-primary-key where keys are read randomly
>>> from a
>>> > file having all primary-keys.
>>> >
>>> >
>>> > I am able to get only 15,000 reads/second from the entire system.
>>> > Is this the best read performance I can expect?
>>> > Are there any benchmarks I can compare against?
>>> >
>>> >
>>> > I tried lowering bloom_filter_fp_chance from 0.01 to 0.0001 and
>>> > setting caching to {'keys': 'ALL', 'rows_per_partition': 'ALL'}, but
>>> > it had no effect.
>>> >
>>> > Thx,
>>> > SG
>>>
>>
>>
>
