incubator-cassandra-user mailing list archives

From Michal Michalski <mich...@opera.com>
Subject Re: index_interval memory savings in our case (if you are curious)… (and performance result)...
Date Thu, 21 Mar 2013 08:53:50 GMT
Argh, now I think that row size has nothing to do with the ii-based 
index size/efficiency (I was thinking about the need of reading 
index_interval / 2 entries on average from the index file before finding the 
proper one, but that should have nothing to do with row size) - forget 
the question; need to get a second coffee ;-)

M.

On 21.03.2013 09:29, Michal Michalski wrote:
> Dean, what is your row size approximately?
>
> We've been using ii = 512 for a long time because of memory issues, but
> now - as the bloom filter is kept off-heap and memory is not an issue
> anymore - I've reverted it to 128 to see if this improves anything. It
> seems it doesn't (except that I see fewer connection resets reported by
> Munin's netstat plugin, but I'm not 100% sure that's related to the lower
> ii, as I don't really believe the disk-scan delay difference with ii =
> 512 could be so large as to time out connections), but I'm just curious how
> "far" we are from the point where it will matter, to know whether this might
> be an issue soon (our rows are growing over time - not very fast, but they
> do), so I'm looking for some "reference" / comparison ;-)
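>
> (For reference, the knob we're talking about is this cassandra.yaml entry -
> the 512 below is just the value we had been running, not a recommendation:
>
>     # sampling of the primary row index; a larger value means fewer index
>     # samples held on heap, but more index entries scanned per read
>     index_interval: 512
>
> and as far as I remember it's a node-level setting, so it has to be changed
> in cassandra.yaml, with a restart, on every node.)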
>
> Currently, according to cfhistograms, vast majority (~70%) of our rows'
> size is up to 20KB and the rest is up to 50KB. I wonder if it's the size
> that really matters in terms of ii value.
>
> M.
>
>
On 20.03.2013 13:54, Hiller, Dean wrote:
>> Oh, and to give you an idea of the memory savings, we had a node at 10G RAM
>> usage...we had upped a few nodes to 16G from 8G as we don't have our new
>> nodes ready yet (we know we should be at 8G, but we would have a dead
>> cluster if we did that).
>>
>> On startup, the initial RAM was around 6-8G.  Startup with
>> index_interval=512 resulted in 2.5G-2.8G initial RAM, and I have seen it
>> grow to 3.3G and back down to 2.8G.  We just rolled this out an hour ago.
>> Our website response time is the same as before as well.
>>
>> We rolled to only 2 nodes (out of 6) in our cluster so far, to test it out
>> and let it soak a bit.  We will slowly roll to more nodes, monitoring the
>> performance as we go.  Also, since dynamic snitch is not working with
>> SimpleSnitch, we know that just one slow node affects our website (from
>> personal pain/experience of nodes hitting the RAM limit, slowing down and
>> causing the website to get really slow).
>>
>> Dean
>>
>> On 3/20/13 6:41 AM, "Andras Szerdahelyi"
>> <andras.szerdahelyi@ignitionone.com> wrote:
>>
>>> 2. Upping index_interval from 128 to 512 (this seemed to reduce our memory
>>> usage significantly!!!)
>>>
>>>
>>> I'd be very careful with that as a one-stop improvement solution, for two
>>> reasons AFAIK:
>>> 1) you have to rebuild sstables (rough command sketch below; not an issue
>>> if you are evaluating, doing test writes etc., not so much in production)
>>> 2) it can affect reads (number of sstable reads to serve a read),
>>> especially if your key/row cache is ineffective
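>>>
>>> (If a rewrite does turn out to be necessary, something along these lines
>>> forces it, per node, after the index_interval change - the keyspace/CF
>>> names are placeholders, and it's worth double checking the nodetool usage
>>> on your version first:
>>>
>>>     nodetool scrub MyKeyspace MyCF    # rewrites every sstable of that CF
>>> )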
>>>
>>> On 20/03/13 13:34, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:
>>>
>>>> Also, look at the cassandra logs.  I bet you see the typical… blah blah is
>>>> at 0.85, doing memory cleanup, which is not exactly GC but cassandra memory
>>>> management… and of course, you have GC on top of that.
>>>>
>>>> If you need to get your memory down, there are multiple ways:
>>>> 1. Switching size-tiered compaction to leveled compaction (with 1 billion
>>>> narrow rows, this helped us quite a bit; rough cqlsh sketch below)
>>>> 2. Upping index_interval from 128 to 512 (this seemed to reduce our memory
>>>> usage significantly!!!)
>>>> 3. Just adding more nodes, as moving rows to other servers reduces the
>>>> memory from #1 and #2 above since each server holds fewer rows
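>>>>
>>>> (For #1, the switch itself is a one-liner in cqlsh - the keyspace/table
>>>> names here are made up and the sstable size is just an example value:
>>>>
>>>>     ALTER TABLE myks.mycf
>>>>       WITH compaction = {'class': 'LeveledCompactionStrategy',
>>>>                          'sstable_size_in_mb': 160};
>>>>
>>>> the actual re-levelling of existing data then happens gradually through
>>>> compaction, so the memory effect is not instant.)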
>>>>
>>>> Later,
>>>> Dean
>>>>
>>>> On 3/20/13 6:29 AM, "Andras Szerdahelyi"
>>>> <andras.szerdahelyi@ignitionone.com> wrote:
>>>>
>>>>>
>>>>> I'd say GC. Please fill in form CASS-FREEZE-001 below and get back
>>>>> to us
>>>>> :-) ( sorry )
>>>>>
>>>>> How big is your JVM heap ? How many CPUs ?
>>>>> Garbage collection taking long ? ( look for log lines from
>>>>> GCInspector)
>>>>> Running out of heap ? ( "heap is .. full" log lines )
>>>>> Any tasks backing up / being dropped ? ( nodetool tpstats and "..
>>>>> dropped
>>>>> in last .. ms" log lines )
>>>>> Are writes really slow? ( nodetool cfhistograms Keyspace
>>>>> ColumnFamily )
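>>>>>
>>>>> ( roughly, those checks translate to the following - the log path is the
>>>>> packaged default, adjust for your install, and the keyspace/CF names are
>>>>> placeholders:
>>>>>
>>>>>     grep GCInspector /var/log/cassandra/system.log | tail
>>>>>     grep -i dropped /var/log/cassandra/system.log | tail
>>>>>     nodetool tpstats
>>>>>     nodetool cfhistograms MyKeyspace MyColumnFamily
>>>>> )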
>>>>>
>>>>> How much is lots of data? Wide or skinny rows? Mutations/sec ?
>>>>> Which Compaction Strategy are you using? Output of show schema (
>>>>> cassandra-cli ) for the relevant Keyspace/CF might help as well
>>>>>
>>>>> What consistency are you doing your writes with ? I assume ONE or
>>>>> ANY if
>>>>> you have a single node.
>>>>>
>>>>> What are the values for these settings in cassandra.yaml
>>>>>
>>>>> memtable_total_space_in_mb:
>>>>> memtable_flush_writers:
>>>>> memtable_flush_queue_size:
>>>>> compaction_throughput_mb_per_sec:
>>>>>
>>>>> concurrent_writes:
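>>>>>
>>>>> ( to pull all of these out in one go - the path is the packaged default,
>>>>> adjust for your install:
>>>>>
>>>>>     grep -E 'memtable_|compaction_throughput|concurrent_writes' \
>>>>>         /etc/cassandra/cassandra.yaml
>>>>> )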
>>>>>
>>>>>
>>>>>
>>>>> Which version of Cassandra?
>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>> Andras
>>>>>
>>>>> From:  Joel Samuelsson <samuelsson.joel@gmail.com>
>>>>> Reply-To:  "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>> Date:  Wednesday 20 March 2013 13:06
>>>>> To:  "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>> Subject:  Cassandra freezes
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> I've been trying to load test a one node cassandra cluster. When I add
>>>>> lots of data, the Cassandra node freezes for 4-5 minutes during which
>>>>> neither reads nor writes are served.
>>>>> During this time, Cassandra takes 100% of a single CPU core.
>>>>> My initial thought was that this was Cassandra flushing memtables to
>>>>> disk; however, the disk I/O is very low during this time.
>>>>> Any idea what my problem could be?
>>>>> I'm running in a virtual environment in which I have no control over
>>>>> drives, so the commit log and data directory are (probably) on the same
>>>>> drive.
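>>>>>
>>>>> (If it helps, the cassandra.yaml settings that control this are, as far
>>>>> as I know:
>>>>>
>>>>>     data_file_directories:
>>>>>         - /var/lib/cassandra/data
>>>>>     commitlog_directory: /var/lib/cassandra/commitlog
>>>>>
>>>>> with the paths shown being the packaged defaults.)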
>>>>>
>>>>> Best regards,
>>>>> Joel Samuelsson
>>>>>
>>>>
>>>

