incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Cassandra Scaling Questions
Date Tue, 03 Aug 2010 01:38:18 GMT
I *think* people lean towards giving more memory to the JVM than to the file cache. People often email about the JVM running
Out Of Memory, so give it more and see how much it uses in your case. Your nodes will
have a minimum memory requirement based on the Memtable thresholds, cache settings and
the usage patterns. It's not like, say, MS SQL Server, where you can tell it to exist
in a certain amount of memory.

http://wiki.apache.org/cassandra/MemtableThresholds may give some background.  

Sorry it's not very clear, perhaps someone else can give a better answer.

Aaron

On 03 Aug, 2010, at 12:11 PM, Aaron Blew <aaronblew@gmail.com> wrote:

> 1.) 16 to 24GB out of how much total system memory?  Is this 50% of available system
RAM or 90%?
>
> Thanks for the reply!
> -Aaron
>
>
> On Mon, Aug 2, 2010 at 2:24 PM, Aaron Morton <aaron@thelastpickle.com> wrote:
>
>     Will answer as best I can, others will know more.
>
>     1) Most people seem to lean towards more memory for the JVM, around 16 to 24 GB. Memory
is also used by the MemTables and, I assume, during the compaction processes.
>
>     2) Cannot say for sure, but I assume so. Think I've seen the cache with data in it
when I have only done writes.
>
>     3) I've noticed large differences between nodes when using the RP and automatic token
assignments, such as the last node with very little data. Try setting tokens at start up,
see http://wiki.apache.org/cassandra/Operations
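
As a rough illustration of "setting tokens at start up": a minimal sketch, assuming the RandomPartitioner token space runs from 0 to 2**127 and that each node is given one evenly spaced token through its initial token setting. The function name `balanced_tokens` and the four-node ring are hypothetical, chosen only for this example.

```python
# Sketch: evenly spaced initial tokens for a RandomPartitioner ring.
# Assumption: the token space is 0 .. 2**127, so node i of N gets i * 2**127 / N.

RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return one evenly spaced token per node (hypothetical helper)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer division keeps the tokens exact; Python ints do not overflow.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Example: a hypothetical four-node ring.
    for index, token in enumerate(balanced_tokens(4)):
        print("node %d: initial token %d" % (index, token))
```

Each printed value would go into the matching node's configuration before that node first joins the ring; nodes that already hold data need the move or load balance procedures from the Operations page instead.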
>
>     3.5) Yes, load balance restores things. I suggest you run it on one node at a time.
Start with the node with the lowest load. Watch the progress by monitoring the streams via
JMX or nodetool.
>
>     Hope that helps.
>     Aaron
>
>
>
>
>     On 03 Aug, 2010, at 07:28 AM, Aaron Blew <aaronblew@gmail.com> wrote:
>
>>     Hi All,
>>     I've got a couple questions that have come up about how Cassandra works and what
others are seeing in their environments.  Here goes:
>>
>>     1.) What have you found to be the best ratio of Cassandra row cache to memory
free on the system for filesystem cache?  Are you tuning it like an RDBMS so Cassandra has
the vast majority of the RAM in the system or are you letting the filesystem cache do some
of the work?
>>
>>     2.) Is the Cassandra cache write-through (i.e., are new records held in the row
cache as they're written to disk)?
>>
>>     3.) When using the random partitioner how much difference should be expected
(or has been observed) between nodes?  2%? 10%?
>>
>>     3.5) Can a load balance be expected to bring the data distribution pretty close
to even among all nodes in the ring?  Is the correct process for a loadbalance to run the
loadbalance operation on each node in the ring?
>>
>>
>>     Thanks!  I'm curious to hear what others have observed.
>>     -Aaron
>
>
