cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "Operations" by EricEvans
Date Fri, 22 Jan 2010 20:55:20 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by EricEvans.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=24&rev2=25

--------------------------------------------------

  
   1. You should wait long enough for all the nodes in your cluster to become aware of the
bootstrapping node via gossip before starting another bootstrap.  For most clusters 30s will
be plenty of time.
   1. Automatically picking a Token only allows doubling your cluster size at once; for more
than that, let the first group finish before starting another.
-  1. As a safety measure, Cassandra does not automatically remove data from nodes that "lose"
part of their Token Range to a newly added node.  Run "nodeprobe cleanup" on the source node(s)
when you are satisfied the new node is up and working. If you do not do this the old data
will still be counted against the load on that node and future bootstrap attempts at choosing
a location will be thrown off.
+  1. As a safety measure, Cassandra does not automatically remove data from nodes that "lose"
part of their Token Range to a newly added node.  Run "nodetool cleanup" on the source node(s)
when you are satisfied the new node is up and working (see the example below). If you do not
do this, the old data will still be counted against the load on that node, and future bootstrap
attempts at choosing a location will be thrown off.
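
 A minimal sketch of that cleanup step, assuming the source node is reachable at 10.0.0.1 (a made-up address) and that your nodetool accepts a `-h <host>` flag (older builds spell it `-host`):

 {{{
 # Hypothetical address; run against each node that gave up part of its Range.
 # Only do this once the new node is confirmed up and serving requests.
 bin/nodetool -h 10.0.0.1 cleanup
 }}}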
  
  Cassandra is smart enough to transfer data from the nearest source node(s), if your !EndpointSnitch
is configured correctly.  So, the new node doesn't need to be in the same datacenter as the
primary replica for the Range it is bootstrapping into, as long as another replica is in the
datacenter with the new one.
  
  == Removing nodes entirely ==
- You can take a node out of the cluster with `nodeprobe decommission` to a live node, or
`nodeprobe removetoken` (to any other machine) to remove a dead one.  This will assign the
ranges the old node was responsible for to other nodes, and replicate the appropriate data
there.
+ You can take a live node out of the cluster with `nodetool decommission` (issued to the node
itself), or remove a dead one with `nodetool removetoken` (issued to any other live machine).
This will assign the ranges the old node was responsible for to other nodes, and replicate the
appropriate data there.
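
 A hedged sketch of both removal paths; 10.0.0.1 and 10.0.0.2 are made-up addresses, and the host flag is written here as `-h` (`-host` on older builds):

 {{{
 # Retire a live node: issue decommission to the node that is leaving.
 bin/nodetool -h 10.0.0.1 decommission

 # Remove a dead node: issue removetoken to any live node, passing the
 # dead node's Token (obtainable via `nodetool ring`).
 bin/nodetool -h 10.0.0.2 removetoken <token-of-dead-node>
 }}}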
  
  No data is removed automatically from the node being decommissioned, so if you want to put
the node back into service at a different token on the ring, it should be removed manually.
  
  === Moving nodes ===
- `nodeprobe move`: move the target node to to a given Token. Moving is essentially a convenience
over decommission + bootstrap.
+ `nodetool move`: move the target node to a given Token. Moving is essentially a convenience
over decommission + bootstrap.
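
 For instance, to place a node at the midpoint of a RandomPartitioner ring (0 to 2^127), assuming the made-up address 10.0.0.3:

 {{{
 # Decommission + bootstrap under the hood: the node streams its old ranges
 # away, then re-bootstraps at the given Token (here 2^126).
 bin/nodetool -h 10.0.0.3 move 85070591730234615865843651857942052864
 }}}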
  
  === Load balancing ===
- `nodeprobe loadbalance`: also essentially a convenience over decommission + bootstrap, only
instead of telling the target node where to move on the ring it will choose its location based
on the same heuristic as Token selection on bootstrap.
+ `nodetool loadbalance`: also essentially a convenience over decommission + bootstrap, only
instead of telling the target node where to move on the ring, it will choose its location based
on the same heuristic as Token selection on bootstrap.
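
 A sketch, assuming the made-up address 10.0.0.4:

 {{{
 # The node decommissions itself and re-bootstraps at a position chosen by the
 # same heuristic used for Token selection during bootstrap.
 bin/nodetool -h 10.0.0.4 loadbalance

 # Check the resulting Token assignments afterwards.
 bin/nodetool -h 10.0.0.4 ring
 }}}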
  
  == Consistency ==
  Cassandra allows clients to specify the desired consistency level on reads and writes. 
(See [[API]].)  If R + W > N, where R, W, and N are respectively the read replica count,
the write replica count, and the replication factor, all client reads will see the most recent
write.  Otherwise, readers '''may''' see older versions, for periods of typically a few ms;
this is called "eventual consistency."  See http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
and http://queue.acm.org/detail.cfm?id=1466448 for more.
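
 For example, with a replication factor of N = 3, writing at W = 2 and reading at R = 2 gives R + W = 4 > 3, so every read overlaps the most recent write on at least one replica; with W = 1 and R = 1, R + W = 2 <= 3, and a read may briefly miss a write that has not yet reached the replica it queried.
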
@@ -85, +85 @@

  Cassandra repairs data in two ways:
  
   1. Read Repair: every time a read is performed, Cassandra compares the versions at each
replica (in the background, if a low consistency was requested by the reader to minimize latency),
and the newest version is sent to any out-of-date replicas.
-  1. Anti-Entropy: when `nodeprobe repair` is run, Cassandra performs a major compaction,
computes a Merkle Tree of the data on that node, and compares it with the versions on other
replicas, to catch any out of sync data that hasn't been read recently.  This is intended
to be run infrequently (e.g., weekly) since major compaction is relatively expensive.
+  1. Anti-Entropy: when `nodetool repair` is run, Cassandra performs a major compaction,
computes a Merkle Tree of the data on that node, and compares it with the versions on other
replicas, to catch any out of sync data that hasn't been read recently.  This is intended
to be run infrequently (e.g., weekly) since major compaction is relatively expensive.
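
 A sketch of running the anti-entropy repair described above, assuming the made-up address 10.0.0.1:

 {{{
 # Triggers a major compaction and a Merkle Tree comparison with the other
 # replicas, so schedule it for off-peak hours (e.g. a weekly cron job).
 bin/nodetool -h 10.0.0.1 repair
 }}}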
  
  === Handling failure ===
  If a node goes down and comes back up, the ordinary repair mechanisms will be adequate to
deal with any inconsistent data.  If a node goes down entirely, then you have two options:
  
-  1. (Recommended approach) Bring up the replacement node with a new IP address, and !AutoBootstrap
set to true in storage-conf.xml. This will place the replacement node in the cluster and find
the appropriate position automatically. Then the bootstrap process begins. While this process
runs, the node will not receive reads until finished. Once this process is finished on the
replacement node, run `nodeprobe removetoken` once, suppling the token of the dead node, and
`nodeprobe cleanup` on each node.
+  1. (Recommended approach) Bring up the replacement node with a new IP address, and !AutoBootstrap
set to true in storage-conf.xml. This will place the replacement node in the cluster and find
the appropriate position automatically, and then the bootstrap process begins. While this process
runs, the node will not receive reads. Once bootstrap is finished on the replacement node, run
`nodetool removetoken` once, supplying the token of the dead node, and `nodetool cleanup` on
each node (see the sketch after this list).
-  * You can obtain the dead node's token by running `nodeprobe ring` on any live node, unless
there was some kind of outage, and the others came up but not the down one -- in that case,
you can retrieve the token from the live nodes' system tables.
+  * You can obtain the dead node's token by running `nodetool ring` on any live node, unless
there was some kind of outage and the other nodes came back up without the dead one -- in that
case, you can retrieve the token from the live nodes' system tables.
  
-  1. (Alternative approach) Bring up a replacement node with the same IP and token as the
old, and run `nodeprobe repair`. Until the repair process is complete, clients reading only
from this node may get no data back.  Using a higher !ConsistencyLevel on reads will avoid
this. 
+  1. (Alternative approach) Bring up a replacement node with the same IP and token as the
old, and run `nodetool repair`. Until the repair process is complete, clients reading only
from this node may get no data back.  Using a higher !ConsistencyLevel on reads will avoid
this. 
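
 A sketch of the recommended sequence, assuming made-up addresses (10.0.0.2 and 10.0.0.3 are surviving nodes) and the `-h` host flag:

 {{{
 # 1. Find the dead node's Token from any live node.
 bin/nodetool -h 10.0.0.2 ring

 # 2. Start the replacement node (new IP) with AutoBootstrap set to true in
 #    storage-conf.xml and wait for the bootstrap to finish.

 # 3. Remove the dead node's Token, once, via any live node.
 bin/nodetool -h 10.0.0.2 removetoken <token-of-dead-node>

 # 4. Clean up stale data and old Hinted Handoff writes on each live node.
 bin/nodetool -h 10.0.0.2 cleanup
 bin/nodetool -h 10.0.0.3 cleanup
 }}}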
  
- The reason why you run `nodeprobe cleanup` on all live nodes is to remove old Hinted Handoff
writes stored for the dead node.
+ The reason you run `nodetool cleanup` on all live nodes is to remove old Hinted Handoff
writes stored for the dead node.
  
  == Backing up data ==
- Cassandra can snapshot data while online using `nodeprobe snapshot`.  You can then back
up those snapshots using any desired system, although leaving them where they are is probably
the option that makes the most sense on large clusters.
+ Cassandra can snapshot data while online using `nodetool snapshot`.  You can then back up
those snapshots using any desired system, although leaving them where they are is probably
the option that makes the most sense on large clusters.
  
- Currently, only flushed data is snapshotted (not data that only exists in the commitlog).
 Run `nodeprobe flush` first and wait for that to complete, to make sure you get '''all'''
data in the snapshot.
+ Currently, only flushed data is snapshotted (not data that only exists in the commitlog).
 Run `nodetool flush` first and wait for that to complete, to make sure you get '''all'''
data in the snapshot.
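
 A sketch, assuming the made-up address 10.0.0.1 (depending on your version, `flush` may also accept a keyspace argument to limit its scope):

 {{{
 # Flush memtables so that commitlog-only data reaches the sstables on disk...
 bin/nodetool -h 10.0.0.1 flush

 # ...then take the snapshot once the flush has completed.
 bin/nodetool -h 10.0.0.1 snapshot
 }}}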
  
  To revert to a snapshot, shut down the node, clear out the old commitlog and sstables, and
move the sstables from the snapshot location to the live data directory.
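
 A sketch of the revert procedure, assuming the default DataFileDirectory and CommitLogDirectory from storage-conf.xml and a hypothetical keyspace named Keyspace1; the exact snapshot path depends on your version and settings, so treat the paths below as illustrative only:

 {{{
 # Run with the node shut down.
 rm /var/lib/cassandra/commitlog/*
 rm /var/lib/cassandra/data/Keyspace1/*.db

 # Move the snapshotted sstables back into the live data directory,
 # then restart the node.
 mv /var/lib/cassandra/data/Keyspace1/snapshots/<snapshot-name>/* \
    /var/lib/cassandra/data/Keyspace1/
 }}}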
  
@@ -128, +128 @@

  == Monitoring ==
  Cassandra exposes internal metrics as JMX data.  This is a common standard in the JVM world;
OpenNMS, Nagios, and Munin at least offer some level of JMX support.
  
- Running `nodeprobe cfstats` can provide an overview of each Column Family, and important
metrics to graph your cluster. Some folks prefer having to deal with non-jmx clients, there
is a JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge/
+ Running `nodetool cfstats` provides an overview of each Column Family and the important
metrics to graph for your cluster. For those who prefer not to deal with JMX clients directly,
there is a JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge/
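
 For example, assuming the made-up address 10.0.0.1:

 {{{
 # Per-Column Family statistics: read/write counts, latencies, pending tasks, etc.
 bin/nodetool -h 10.0.0.1 cfstats
 }}}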
  
  Important metrics to watch on a per-Column Family basis would be: '''Read Count, Read Latency,
Write Count and Write Latency'''. '''Pending Tasks''' tell you if things are backing up. These
metrics can also be exposed using any JMX client such as `jconsole`.
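
 For ad-hoc inspection you can point jconsole at a node's JMX port; the port is set in the startup scripts/configuration (8080 was a common default in this era, so treat it as an assumption):

 {{{
 # Connect jconsole to a node's JMX endpoint (made-up address and port).
 jconsole 10.0.0.1:8080
 }}}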
  
@@ -136, +136 @@

  
  If you are seeing a lot of tasks building up, your hardware or configuration tuning is
probably the bottleneck.
  
- Running `nodeprobe tpstats` will dump all of those threads to console if you don't want
to use jconsole. Example:
+ Running `nodetool tpstats` will dump all of those threads to the console if you don't want
to use jconsole. Example:
  
  {{{
  Pool Name                    Active   Pending      Completed
