cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "MemtableThresholds" by EricEvans
Date Fri, 15 May 2009 19:40:21 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The following page has been changed by EricEvans:
http://wiki.apache.org/cassandra/MemtableThresholds

The comment on the change is:
feels like there is still lots of room for improvement, but I'm coming up empty

------------------------------------------------------------------------------
- '''''THIS DOCUMENT IS A WORK IN PROGRESS'''''
- 
  When performing write operations, Cassandra stores values to column-family
  specific, in-memory data structures called Memtables. These Memtables are
  flushed to disk whenever one of the configurable thresholds is exceeded.
- Proper tuning of these thresholds is important since the more memory that 
+ Proper tuning of these thresholds is important in making the most of 
+ available system memory, without bringing the node down for lack of memory.
- can be put to use the better, while running out of memory is a sure way to
- bring down the node.
  
+ == Configuring Thresholds ==
- Since Memtables store actual column values, they consume at least as
+ Since Memtables are storing actual column values, they consume at least as
  much memory as the size of data inserted. However, there is also overhead 
- associated with the data-structures used to index this data. When the
+ associated with the structures used to index this data. When the
  number of columns and rows is high compared to the size of values, this
- overhead can become quite significant.
+ overhead can become quite significant (possibly greater than the data
+ itself).
  
- == Threshold Configuration ==
+ In other words, which threshold(s) to use, and what to set them to,
+ depends not only on how much memory you have, but also on how many
+ column families you have, how many columns each one holds, and the size
+ of the values being stored.
+ 
  Listed below are the thresholds found in `storage-conf.xml`, along with a
  description.
  
@@ -25, +28 @@

  it to be flushed to disk. It corresponds to the size of the values
  inserted, (plus the size of the containing column).
  
+ If left unconfigured (missing from the config), this defaults to 128MB.
+ 
+ ''Note: The value is applied on a per column-family basis.''
+ 
  === MemtableObjectCountInMillions ===
- If left unset, defaults to 1, (or 1,000,000 objects).
+ This directive sets a threshold on the number of columns stored. 
+ 
+ If left unconfigured (missing from the config), this defaults to 1 
+ (or 1,000,000 objects).
+ 
+ ''Note: The value is applied on a per column-family basis.''
  
  == Using Jconsole To Optimize Thresholds ==
  Cassandra's column-family mbeans have a number of attributes that can
- prove invaluable in determining optimal thresholds. Onc way to access
+ prove invaluable in determining optimal thresholds. One way to access
- this instrumentation is using Jconsole, a graphical monitoring and
+ this instrumentation is by using Jconsole, a graphical monitoring and
  management application that ships with your JDK.
  
  Launching Jconsole with no arguments will display the "New Connection"
@@ -51, +63 @@

   1. ''!MemtableDataSize'', which is used to determine the total size of stored data. This is the sum of all the values stored and does not account for Memtable overhead, (i.e. it's not indicative of the actual memory used by the Memtable). Use this value when adjusting [#MemtableSizeInMB MemtableSizeInMB].
   1. Finally there is ''!MemtableSwitchCount'' which increases by one each time a column family flushes its Memtable to disk.
  
- ''Note: You'll need to manually mash the `Refresh` button to update the values.''
+ ''Note: You'll need to manually mash the `Refresh` button to update these values.''
  
  attachment:jconsole_attributes.png
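
As a rough illustration of how the two thresholds interact, here is a minimal Python sketch. It is not part of Cassandra. The function names, the `overhead_factor` parameter, and the reading of 1 MB as 2^20 bytes are all assumptions made for this example. The only figures taken from the page are the defaults (128MB and 1 million objects) and the fact that both thresholds apply per column family.

```python
# Illustrative arithmetic only; none of these names exist in Cassandra.

BYTES_PER_MB = 1024 * 1024  # assumption: "MB" means 2**20 bytes


def memtable_heap_estimate_mb(column_families, memtable_size_mb=128.0,
                              overhead_factor=0.0):
    """Estimate memory held by Memtables when every column family is
    about to hit its size threshold.

    overhead_factor is a hypothetical ratio of index overhead to data
    (0.0 = data only, 1.0 = overhead equal to the data itself).
    """
    if column_families < 0 or memtable_size_mb < 0 or overhead_factor < 0:
        raise ValueError("arguments must be non-negative")
    return column_families * memtable_size_mb * (1.0 + overhead_factor)


def first_threshold_hit(avg_column_bytes, memtable_size_mb=128.0,
                        object_count_millions=1.0):
    """Return 'size' or 'count', whichever threshold a Memtable would
    reach first for columns of the given average size."""
    if avg_column_bytes <= 0:
        raise ValueError("avg_column_bytes must be positive")
    columns_to_fill_size = memtable_size_mb * BYTES_PER_MB / avg_column_bytes
    columns_to_fill_count = object_count_millions * 1_000_000
    return "size" if columns_to_fill_size < columns_to_fill_count else "count"


if __name__ == "__main__":
    # Four column families at the default size threshold, data only.
    print(memtable_heap_estimate_mb(4))
    # Same, assuming index overhead as large as the data itself.
    print(memtable_heap_estimate_mb(4, overhead_factor=1.0))
    # Small columns tend to trip the object count first, large ones the size.
    print(first_threshold_hit(100), first_threshold_hit(1024))
```

The first function gives only a lower bound when `overhead_factor` is left at 0.0, since the page notes that overhead can exceed the data itself. Comparing its output with the ''!MemtableDataSize'' values seen in Jconsole is one way to sanity-check a chosen setting.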
  
