cassandra-user mailing list archives

From crypto five <cryptof...@gmail.com>
Subject Re: cassandra read latency help
Date Thu, 31 May 2012 21:09:10 GMT
But I think it's a bad idea, since hot data will be evenly distributed
across multiple sstables and filesystem pages.

On Thu, May 31, 2012 at 1:08 PM, crypto five <cryptofive@gmail.com> wrote:

> You may also consider disabling the key/row caches entirely.
> 1mm rows * 400 bytes = 400MB of data, which can easily sit in the fs cache,
> so you can serve your hot keys at thousands of qps without hitting disk at
> all. Enabling compression can make the situation even better.
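The working-set arithmetic above can be checked in a couple of lines (the numbers are the message's own; actual fs cache behavior of course depends on the OS and competing memory pressure):

```python
# Hot working set from the message: 1 million rows * ~400 bytes each.
hot_rows = 1_000_000
row_bytes = 400
hot_set_mb = hot_rows * row_bytes / 1024 / 1024
print(f"hot working set: {hot_set_mb:.0f} MB")  # ~381 MB, easily fs-cacheable
```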
>
>
> On Thu, May 31, 2012 at 12:01 PM, Gurpreet Singh <gurpreet.singh@gmail.com
> > wrote:
>
>> Aaron,
>> Thanks for your email. The test roughly resembles how the actual
>> application will behave.
>> It is going to be a simple key-value store with 500 million keys per
>> node. The traffic will be read-heavy in steady state, and some keys will
>> see much more traffic than others. The hot rows are estimated at anywhere
>> between 500,000 and 1 million keys.
>>
>> I have already populated this test system with 500 million keys,
>> compacted it all to 1 file to check the size of the bloom filter and the
>> index.
>>
>> This is how i am estimating my memory for 500 million keys. please
>> correct me if i am wrong or if i am missing a step.
>>
>> bloom filter: 1 gig
>> index samples: the index file is 8.5 gigs, covering all keys. The index
>> interval is 128, so in RAM this would be (8.5g / 128) * 10 (factor for
>> data-structure overhead) = 664 MB (let's say 1 gig)
>>
>> key cache size (3 million): 3 gigs
>> memtable_total_space_mb : 2 gigs
>>
>> This totals 7 gigs, and my heap size is 8 gigs.
>> Is there anything else that i am missing here?
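The back-of-envelope estimate above can be sketched in a few lines of Python. All figures come from the message itself; the 10x multiplier is the poster's assumed data-structure overhead for index samples, not a measured value:

```python
# Heap estimate for 500 million keys, using the thread's own numbers.
MB = 1
GB = 1024 * MB

bloom_filter   = 1 * GB
index_samples  = (8.5 * GB / 128) * 10   # 8.5 GB index file, interval 128, 10x overhead
key_cache      = 3 * GB                  # 3 million cached keys
memtable_space = 2 * GB                  # memtable_total_space_mb

total = bloom_filter + index_samples + key_cache + memtable_space
print(f"index samples: {index_samples:.0f} MB")    # 680 MB
print(f"estimated heap use: {total / GB:.1f} GB")  # ~6.7 GB of an 8 GB heap
```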
>> When i run top right now, it shows java at 96% memory. That's a concern
>> because there is no write load. Should i be looking at any other number
>> here?
>>
>> Off-heap row cache: 500,000 - 750,000 rows ~ between 3 and 5 gigs (avg
>> row size = 250-500 bytes)
>>
>> My test system has 16 gigs RAM; the production systems will likely have
>> 32 gigs RAM and 12 spindles instead of the 6 i am testing with.
>>
>> I changed the underlying filesystem from xfs to ext2, and i am seeing
>> better results, though still not great.
>> The cfstats latency is down to 20 ms at a 35 qps read load. The row cache
>> hit rate is 0.21, key cache 0.75.
>> Measuring from the client side, i am seeing roughly 10-15 ms per key. i
>> would like it even lower though; any tips would help greatly.
>> In production, i am hoping the row cache hit rate will be higher.
>>
>>
>> The biggest thing affecting my system right now is the "Invalid
>> frame size of 0" error that the cassandra server keeps printing. It's
>> causing read timeouts every minute or two. I haven't been able to figure
>> out a way to fix this one. I see someone else has also reported it, but i
>> am not sure whether the problem is in hector, cassandra, or thrift.
>>
>> Thanks
>> Gurpreet
>>
>> On Wed, May 30, 2012 at 4:38 PM, aaron morton <aaron@thelastpickle.com>wrote:
>>
>>> 80 ms per request
>>>
>>> sounds high.
>>>
>>> I'm doing some guessing here; I suspect memory usage is the problem.
>>>
>>> * I assume you are no longer seeing excessive GC activity.
>>> * The key cache will not get used when you hit the row cache. I would
>>> disable the row cache if you have a random workload, which it looks like
>>> you do.
>>> * 500 million is a lot of keys to have on a single node. At the default
>>> index sample of every 128 keys it will have about 4 million samples, which
>>> is probably taking up a lot of memory.
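Aaron's 4-million figure follows directly from the key count and the default interval; a quick sketch (the per-sample memory overhead is not included here):

```python
# 500 million keys sampled every 128th key (the default index_interval).
keys = 500_000_000
index_interval = 128
samples = keys // index_interval
print(f"{samples:,} index samples held in memory")  # 3,906,250
```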
>>>
>>> Is this testing a real-world scenario or an abstract benchmark? IMHO
>>> you will get more insight from testing something that resembles your
>>> application.
>>>
>>> Cheers
>>>
>>>   -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 26/05/2012, at 8:48 PM, Gurpreet Singh wrote:
>>>
>>> Hi Aaron,
>>> Here is the latest on this..
>>> i switched to a node with 6 disks and am running some read tests, and i
>>> am seeing something weird.
>>>
>>> setup:
>>> 1 node, cassandra 1.0.9, 8 cpu, 16 gig RAM, 6 7200 rpm SATA data disks
>>> striped 512 kb, commitlog mirrored.
>>> 1 keyspace with just 1 column family
>>> random partitioner
>>> total number of keys: 500 million (the keys are just longs from 1 to 500
>>> million)
>>> avg key size: 8 bytes
>>> bloom filter size: 1 gig
>>> total disk usage: 70 gigs compacted 1 sstable
>>> mean compacted row size: 149 bytes
>>> heap size: 8 gigs
>>> keycache size: 2 million (takes around 2 gigs in RAM)
>>> rowcache size: 1 million (off-heap)
>>> memtable_total_space_mb : 2 gigs
>>>
>>> test:
>>> Trying to do 5 reads per second. Each read is a multigetslice query for
>>> just 1 key, 2 columns.
>>>
>>> observations:
>>> row cache hit rate: 0.4
>>> key cache hit rate: 0.0 (this will increase later as the system moves to
>>> steady state)
>>> cfstats read latency: 80 ms
>>>
>>> iostat (every 5 seconds):
>>>
>>> r/s : 400
>>> %util: 20%  (all disks are at equal utilization)
>>> await: 65-70 ms (for each disk)
>>> svctm : 2.11 ms (for each disk)
>>> r-kB/s - 35000
>>>
>>> why this is weird:
>>> 5 reads per second is causing a latency of 80 ms per request (according
>>> to cfstats). isn't this too high?
>>> 35 MB/s is being read from disk. That is also very strange: the number
>>> is way too high given that the avg row size is just 149 bytes. Even
>>> index reads should not cause this much data to be read from disk.
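The iostat numbers above can be sanity-checked with simple division. Note the readahead interpretation below is a guess, not something confirmed in the thread, though it would be consistent with the 512 kb stripe size and the later improvement after switching filesystems:

```python
# iostat reported 400 r/s and 35,000 r-kB/s; average size of one disk read:
reads_per_sec = 400
read_kb_per_sec = 35_000
kb_per_read = read_kb_per_sec / reads_per_sec
print(f"{kb_per_read:.1f} kB per disk read")  # 87.5 kB to fetch a 149-byte row
# ~87 kB per read for 149-byte rows suggests aggressive filesystem readahead.
```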
>>>
>>> what i understand is that each read request translates to 2 disk
>>> accesses (because there is only 1 sstable): 1 for the index, 1 for the
>>> data. At such a low read rate, why is the latency so high?
>>>
>>> would appreciate help debugging this issue.
>>> Thanks
>>> Gurpreet
>>>
>>>
>>> On Tue, May 22, 2012 at 2:46 AM, aaron morton <aaron@thelastpickle.com>wrote:
>>>
>>>> With
>>>>
>>>> heap size = 4 gigs
>>>>
>>>> I would check for GC activity in the logs and consider setting it to 8,
>>>> given you have 16 GB. You can also check whether the IO system is
>>>> saturated (
>>>> http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html).
>>>> Also take a look at nodetool cfhistograms to see how many sstables are
>>>> involved in each read.
>>>>
>>>>
>>>> I would start by looking at the latency reported on the server, then
>>>> work back to the client….
>>>>
>>>> I may have missed it in the email, but what recent latency for the CF
>>>> is reported by nodetool cfstats? That's the latency for a single
>>>> request on a single read thread. The default settings give you 32 read
>>>> threads.
>>>>
>>>> If you know the latency for a single request, and you know you have 32
>>>> concurrent read threads, you can get an idea of the max throughput for a
>>>> single node. Once you get above that throughput the latency for a request
>>>> will start to include wait time.
>>>>
>>>> It's a bit more complicated, because when you request 40 rows that
>>>> turns into 40 read tasks. So if two clients send a request for 40 rows at
>>>> the same time there will be 80 read tasks to be processed by 32 threads.
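The capacity model Aaron describes can be sketched numerically. The 32-thread figure is the default read-thread count stated above; the 80 ms single-read latency is the cfstats reading from earlier in the thread:

```python
# With a fixed read-thread pool, the rough throughput ceiling is
# threads / per-read latency; beyond it, requests start queueing.
read_threads = 32      # default concurrent read threads
latency_s = 0.080      # 80 ms per single-row read, from cfstats

max_row_reads = read_threads / latency_s
print(f"~{max_row_reads:.0f} single-row reads/s max")  # ~400

# A 40-row multiget fans out into 40 read tasks, dividing the ceiling:
rows_per_request = 40
max_requests = max_row_reads / rows_per_request
print(f"~{max_requests:.0f} 40-row requests/s max")    # ~10
```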
>>>>
>>>> Hope that helps.
>>>>
>>>>   -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 20/05/2012, at 4:10 PM, Radim Kolar wrote:
>>>>
>>>> On 19.5.2012 at 0:09, Gurpreet Singh wrote:
>>>>
>>>> Thanks Radim.
>>>>
>>>> Radim, actually 100 reads per second is achievable even with 2 disks.
>>>>
>>>> it will get worse as rows become fragmented.
>>>>
>>>> But achieving them with a really low avg latency per key is the issue.
>>>>
>>>>
>>>> I am wondering if anyone has played with index_interval, and how much
>>>> of a difference would it make to reads on reducing the index_interval.
>>>>
>>>> Close to zero, but try it yourself and post your findings.
>>>>
