cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "StorageConfiguration" by gdusbabek
Date Thu, 04 Feb 2010 15:39:47 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "StorageConfiguration" page has been changed by gdusbabek.
http://wiki.apache.org/cassandra/StorageConfiguration?action=diff&rev1=9&rev2=10

--------------------------------------------------

  }}}
  '']''
  
+ ''[New in 0.6: !EndPointSnitch, !ReplicaPlacementStrategy and !ReplicationFactor became
configurable per keyspace.  Prior to that they were global settings.]''
+ === EndPointSnitch ===
+ !EndPointSnitch: Set this to the class that implements {{{IEndPointSnitch}}}, which determines
whether two endpoints are in the same data center or on the same rack. Out of the box, Cassandra
provides {{{org.apache.cassandra.locator.EndPointSnitch}}}.
+ 
+ {{{
+ <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
+ }}}
+ Note: this class works on host IP addresses only. There is no configuration parameter to tell
Cassandra that a node is in rack ''R'' and in datacenter ''D''. The current rules are based
on two methods (see [[http://svn.apache.org/viewvc/incubator/cassandra/trunk/src/java/org/apache/cassandra/locator/EndPointSnitch.java?view=markup|EndPointSnitch.java]]):
+ 
+  * isOnSameRack: Look at the IP addresses of the two hosts and compare the 3rd octet. If they
are the same, the hosts are on the same rack; otherwise they are on different racks.
+ 
+  * isInSameDataCenter: Look at the IP addresses of the two hosts and compare the 2nd octet. If
they are the same, the hosts are in the same datacenter; otherwise they are in different datacenters.
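The two octet rules above can be sketched in a few lines. This is only an illustration in Python of the comparison described in the text, not the Java code from EndPointSnitch.java; the function names and the sample addresses are made up for the example.

```python
def _octets(ip):
    """Split a dotted-quad IPv4 string into its four integer octets."""
    parts = ip.split(".")
    if len(parts) != 4:
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    return [int(p) for p in parts]


def is_in_same_datacenter(host_a, host_b):
    # Rule from the text: the 2nd octet identifies the datacenter.
    return _octets(host_a)[1] == _octets(host_b)[1]


def is_on_same_rack(host_a, host_b):
    # Rule from the text: the 3rd octet identifies the rack.
    return _octets(host_a)[2] == _octets(host_b)[2]


# Hypothetical addresses: same datacenter (20), different racks (30 vs 31).
print(is_in_same_datacenter("10.20.30.1", "10.20.31.1"))  # True
print(is_on_same_rack("10.20.30.1", "10.20.31.1"))        # False
```

In practice this means the IP addressing plan has to encode the topology: hosts in one datacenter must share a 2nd octet, and hosts on one rack must share a 3rd octet.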
+ 
+ === ReplicaPlacementStrategy and ReplicationFactor ===
+ Strategy: Setting this to a class that implements {{{IReplicaPlacementStrategy}}} changes
the way the node picker works. Out of the box, Cassandra provides {{{org.apache.cassandra.locator.RackUnawareStrategy}}}
and {{{org.apache.cassandra.locator.RackAwareStrategy}}} (which places one replica in a different
datacenter and the others on different racks in the same datacenter).
+ 
+ {{{
+ <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
+ }}}
+ !ReplicationFactor: Number of replicas of the data.
+ 
+ {{{
+ <ReplicationFactor>1</ReplicationFactor>
+ }}}
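To make the rack-aware description concrete, here is a simplified sketch in Python of "one replica in a different datacenter, the others on different racks in the same one". It is an assumption-laden illustration, not Cassandra's actual {{{RackAwareStrategy}}}: the ring is modelled as a plain list of host IPs in token order, the helper names are hypothetical, and the final fallback (filling leftover slots in ring order) is a guess added so the function always returns a result.

```python
def _octet(ip, index):
    return int(ip.split(".")[index])


def same_datacenter(a, b):
    return _octet(a, 1) == _octet(b, 1)  # 2nd octet, as in the snitch rules


def same_rack(a, b):
    return _octet(a, 2) == _octet(b, 2)  # 3rd octet, as in the snitch rules


def rack_aware_replicas(ring, primary_index, replication_factor):
    """Pick replicas for the node at primary_index (illustrative only)."""
    primary = ring[primary_index]
    # Walk the ring clockwise, starting just after the primary.
    candidates = [ring[(primary_index + i) % len(ring)] for i in range(1, len(ring))]
    replicas = [primary]

    if replication_factor > 1:
        # One replica goes to the first node found in a different datacenter.
        remote = next((h for h in candidates if not same_datacenter(primary, h)), None)
        if remote is not None:
            replicas.append(remote)

    # The others stay in the primary's datacenter, each on an unused rack.
    for host in candidates:
        if len(replicas) >= replication_factor:
            break
        if host in replicas or not same_datacenter(primary, host):
            continue
        local = [r for r in replicas if same_datacenter(primary, r)]
        if all(not same_rack(host, r) for r in local):
            replicas.append(host)

    # Assumed fallback: fill any remaining slots in plain ring order.
    for host in candidates:
        if len(replicas) >= replication_factor:
            break
        if host not in replicas:
            replicas.append(host)
    return replicas


ring = ["10.1.1.1", "10.1.1.2", "10.1.2.1", "10.2.1.1"]  # hypothetical hosts
print(rack_aware_replicas(ring, 0, 3))  # ['10.1.1.1', '10.2.1.1', '10.1.2.1']
```

With a replication factor of 1, only the primary is returned, which matches the sample configuration above.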
+ 
+ === ColumnFamilies ===
  The {{{CompareWith}}} attribute tells Cassandra how to sort the columns for slicing operations.
 The default is {{{BytesType}}}, which is a straightforward lexical comparison of the bytes
in each column. Other options are {{{AsciiType}}}, {{{UTF8Type}}}, {{{LexicalUUIDType}}},
{{{TimeUUIDType}}}, and {{{LongType}}}.  You can also specify the fully-qualified class name
of a class of your choice that extends {{{org.apache.cassandra.db.marshal.AbstractType}}}.
  
   * {{{SuperColumns}}} have a similar {{{CompareSubcolumnsWith}}} attribute.
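The effect of the comparator on slice order can be illustrated with a short Python sketch. It assumes that {{{BytesType}}} compares column names byte by byte as unsigned values and that {{{LongType}}} names are 8-byte big-endian signed longs; both are assumptions made for the example, and the helper names are hypothetical.

```python
import struct


def pack_long(value):
    """Encode an integer as an 8-byte big-endian signed long."""
    return struct.pack(">q", value)


def unpack_long(name):
    return struct.unpack(">q", name)[0]


def sort_as_bytes(names):
    # Python orders bytes objects byte by byte, treating each byte as unsigned.
    return sorted(names)


def sort_as_longs(names):
    # Decode each name and order numerically instead.
    return sorted(names, key=unpack_long)


names = [pack_long(v) for v in (2, -1, 10)]
print([unpack_long(n) for n in sort_as_bytes(names)])  # [2, 10, -1]
print([unpack_long(n) for n in sort_as_longs(names)])  # [-1, 2, 10]
```

The same column names come back in a different order depending on the comparator, which is why {{{CompareWith}}} should match how the column names are encoded.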
@@ -89, +115 @@

  }}}
  Cassandra uses an MD5 hash internally to place keys on the ring when using {{{RandomPartitioner}}}.
So it makes sense to divide the hash space equally among the available machines using
{{{InitialToken}}} (i.e., if there are 10 machines, each handles 1/10th of the maximum hash value)
and expect that the machines will get a reasonably equal load.
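As a rough sketch of that division, the following Python snippet computes evenly spaced {{{InitialToken}}} values. It assumes the {{{RandomPartitioner}}} token space runs from 0 up to 2^127; that bound and the function name are assumptions for the example, so check them against your Cassandra version before using the values.

```python
RING_SIZE = 2 ** 127  # assumed size of the RandomPartitioner token space


def initial_tokens(node_count):
    """Return one evenly spaced token per node, starting at 0."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


# One token per machine for a 10-node cluster.
for node, token in enumerate(initial_tokens(10)):
    print(node, token)
```

Each value would go into the {{{InitialToken}}} setting of one node, so that every node owns roughly 1/10th of the hash space.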
  
- == EndPointSnitch ==
- !EndPointSnitch: Setting this to the class that implements {{{IEndPointSnitch}}} which will
see if two endpoints are in the same data center or on the same rack. Out of the box, Cassandra
provides {{{org.apache.cassandra.locator.EndPointSnitch}}}
- 
- {{{
- <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
- }}}
- Note: this class will work on hosts' IPs only. There is no configuration parameter to tell
Cassandra that a node is in rack ''R'' and in datacenter ''D''. The current rules are based
on the two methods: (see [[http://svn.apache.org/viewvc/incubator/cassandra/trunk/src/java/org/apache/cassandra/locator/EndPointSnitch.java?view=markup|EndPointSnitch.java]]):
- 
-  * isOnSameRack: Look at the IP Address of the two hosts. Compare the 3rd octet. If they
are the same then the hosts are in the same rack else different racks.
- 
-  * isInSameDataCenter: Look at the IP Address of the two hosts. Compare the 2nd octet. If
they are the same then the hosts are in the same datacenter else different datacenter.
- 
- == ReplicaPlacementStrategy ==
- Strategy: Setting this to the class that implements {{{IReplicaPlacementStrategy}}} will
change the way the node picker works. Out of the box, Cassandra provides {{{org.apache.cassandra.locator.RackUnawareStrategy}}}
and {{{org.apache.cassandra.locator.RackAwareStrategy}}} (place one replica in a different
datacenter, and the others on different racks in the same one.)
- 
- {{{
- <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
- }}}
- Number of replicas of the data
- 
- {{{
- <ReplicationFactor>1</ReplicationFactor>
- }}}
  == Directories ==
  Directories: Specify where Cassandra should store different data on disk.  Keep the data
disks and the {{{CommitLog}}} disks separate for best performance. See also [[FAQ#what_kind_of_hardware_should_i_use|what
kind of hardware should I use?]]
  
