cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "FAQ" by JonathanEllis
Date Tue, 30 Aug 2011 15:16:01 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "FAQ" page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=134&rev2=135

  == Why aren't range slices/sequential scans giving me the expected results? ==
  You're probably using the RandomPartitioner.  This is the default because it avoids hotspots,
but it means your rows are ordered by the md5 of the row key rather than lexicographically
by the raw key bytes.
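
As a rough illustration of that ordering difference, the sketch below sorts a few made-up keys both ways. The `md5_token` helper is hypothetical: it only approximates the idea of "order by the md5 of the key" and is not the exact token computation RandomPartitioner performs.

```python
import hashlib


def md5_token(key: bytes) -> int:
    """Illustrative token: the MD5 digest of the key read as a big integer.

    Assumption: this mimics "ordered by the md5 of the row key"; the real
    partitioner's token derivation is not reproduced here.
    """
    return int.from_bytes(hashlib.md5(key).digest(), "big")


# Made-up row keys, already in lexicographic order.
keys = [b"apple", b"banana", b"cherry", b"date"]

lexicographic = sorted(keys)            # what a naive range scan might expect
by_token = sorted(keys, key=md5_token)  # roughly what a hash-ordered ring returns

if __name__ == "__main__":
    print("lexicographic:", lexicographic)
    print("by md5 token: ", by_token)
```

A range slice walks the ring in the second order, which is why a start key of "a" and an end key of "c" does not return "everything beginning with a or b".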
  
- You '''can''' start out with a start key and end key of [empty] and use the row count argument
instead, if your goal is paging the rows.  To get the next page, start from the last key you
got in the previous page.
+ You '''can''' start out with a start key and end key of [empty] and use the row count argument
instead, if your goal is paging the rows.  To get the next page, start from the last key you
got in the previous page. This is what the Cassandra Hadoop RecordReader does, for instance.
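
The paging approach described above can be sketched as follows. This is a minimal sketch, not client library code: `get_range(start_key, end_key, count)` is a hypothetical callable standing in for whatever range-slice call your client exposes, and `make_fake_cluster` is an in-memory stand-in used only to make the example runnable. It assumes the start key is inclusive, so the last key of one page comes back as the first key of the next and has to be skipped.

```python
import hashlib
from typing import Callable, Iterator, List

# Hypothetical signature: (start_key, end_key, row_count) -> list of row keys.
RangeFn = Callable[[bytes, bytes, int], List[bytes]]


def iter_all_keys(get_range: RangeFn, page_size: int = 100) -> Iterator[bytes]:
    """Yield every row key by paging with empty start/end keys and a row count."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never make progress.
        raise ValueError("page_size must be at least 2")
    start = b""  # [empty] start key
    first_page = True
    while True:
        page = get_range(start, b"", page_size)  # [empty] end key
        # Assumption: start key is inclusive, so drop the repeated first key.
        yield from (page if first_page else page[1:])
        if len(page) < page_size:
            return  # short page: nothing left to read
        start = page[-1]  # next page starts from the last key we got
        first_page = False


def make_fake_cluster(keys: List[bytes]) -> RangeFn:
    """In-memory stand-in that returns keys in md5 order, ignoring end_key."""
    ordered = sorted(keys, key=lambda k: hashlib.md5(k).digest())

    def get_range(start: bytes, end: bytes, count: int) -> List[bytes]:
        i = ordered.index(start) if start else 0
        return ordered[i:i + count]

    return get_range


if __name__ == "__main__":
    cluster = make_fake_cluster([f"user{i}".encode() for i in range(10)])
    for key in iter_all_keys(cluster, page_size=3):
        print(key)
```

The keys come back in hash order rather than key order, but every row is visited exactly once.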
  
  You can also use intra-row ordering of column names to get ordered results '''within'''
a row; with appropriate row 'bucketing,' you often don't need the rows themselves to be ordered.
  
@@ -443, +443 @@

  <<Anchor(seed)>>
  
  == What are seeds? ==
- 
  Seeds are used during startup to discover the cluster.
  
- If you configure your nodes to refer to some node as a seed, nodes in your ring tend to send
Gossip messages to seeds more often (refer to [[ArchitectureGossip]] for details) than to
non-seeds. In other words, seeds act as hubs of the Gossip network. With seeds, each node
can detect status changes of other nodes quickly.
+ If you configure your nodes to refer to some node as a seed, nodes in your ring tend to send
Gossip messages to seeds more often (refer to ArchitectureGossip for details) than to non-seeds.
In other words, seeds act as hubs of the Gossip network. With seeds, each node can detect
status changes of other nodes quickly.
  
  Seeds are also consulted by new nodes on bootstrap to learn about other nodes in the ring. When you
add a new node to the ring, you need to specify at least one live seed to contact. Once a node
joins the ring, it learns about the other nodes, so it doesn't need a seed on subsequent boots.
  
@@ -457, +456 @@

  Seeds do not auto bootstrap (i.e. if a node has itself in its seed list it will not automatically
transfer data to itself). If you want a node to do that, bootstrap it first and then add it
to seeds later. If you have no data (new install) you do not have to worry about bootstrap
or autobootstrap at all.
  
  Recommended usage of seeds:
+ 
-  * pick two (or more) nodes per data center as seed nodes. 
+  * pick two (or more) nodes per data center as seed nodes.
-  * sync the seed list to all your nodes 
+  * sync the seed list to all your nodes
  
  <<Anchor(seed_spof)>>
  
  == Does single seed mean single point of failure? ==
+ If you are using a replicated CF on the ring, having only one seed in the ring doesn't mean a single
point of failure. The ring can operate or boot without the seed. However, it will need more
time to spread node status changes over the ring. It is recommended to have multiple seeds
in a production system.
- 
- If you are using a replicated CF on the ring, having only one seed in the ring
- doesn't mean a single point of failure. The ring can operate or boot
- without the seed. However, it will need more time to spread node status changes over
the ring.
- It is recommended to have multiple seeds in a production system.
  
  <<Anchor(jconsole_array_arg)>>
  
  == Why can't I call jmx method X on jconsole? (ex. getNaturalEndpoints) ==
- 
- Some JMX operations can't be called with jconsole because the buttons are inactive for
them. Jconsole doesn't support array arguments, so operations which need an array as an argument
can't be invoked on jconsole.
+ Some JMX operations can't be called with jconsole because the buttons are inactive for
them. Jconsole doesn't support array arguments, so operations which need an array as an argument
can't be invoked on jconsole. You need to write a JMX client to call such operations or use an
array-capable JMX monitoring tool.
- You need to write a JMX client to call such operations or use an array-capable JMX monitoring
tool.
  
  <<Anchor(max_key_size)>>
  
  == What's the maximum key size permitted? ==
- 
  The key (and column names) must be under 64K bytes.
  
  Routing is O(N) of the key size and querying and updating are O(N log N). In practice these
factors are usually dwarfed by other overhead, but some users with very large "natural" keys
use their hashes instead to cut down the size.
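
A minimal sketch of the "use a hash instead" idea follows. The function name `compact_row_key` and the choice of SHA-256 are assumptions made for illustration; the text above does not say which hash those users pick.

```python
import hashlib


def compact_row_key(natural_key: bytes) -> bytes:
    """Return a fixed-size stand-in for a very large natural key.

    Assumption: SHA-256 is used here only as an example of a stable hash;
    any stable hash with a low enough collision risk would serve.
    """
    return hashlib.sha256(natural_key).digest()


if __name__ == "__main__":
    huge_key = b"x" * 200_000  # made-up natural key, far over the 64K limit
    row_key = compact_row_key(huge_key)
    print(len(huge_key), "->", len(row_key), "bytes")
```

Since the hash cannot be reversed, applications that need the original value back would have to store the natural key in a column of the row.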
  
+ <<Anchor(ubuntu_ec2_hangs)>> <<Anchor(ubuntu_hangs)>>
- <<Anchor(ubuntu_ec2_hangs)>>
- <<Anchor(ubuntu_hangs)>>
  
  == I'm using Ubuntu with JNA, and holy crap weird things keep hanging and stalling and blocking
and printing scary tracebacks in dmesg! ==
- 
  We have come across several different, but similar, sets of symptoms that might match what
you're seeing. They might all have the same root cause; it's not clear. One common piece is
messages like this in dmesg:
  
  {{{
  INFO: task (some_taskname):(some_pid) blocked for more than 120 seconds.
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  }}}
- 
  It does not seem that anyone has had the time to track this down to the real root cause,
but it does seem that upgrading the linux-image package and rebooting your instances fixes
it. There is likely some bug in several of the kernel builds distributed by Ubuntu which is
fixed in later versions. Versions of linux-image-* which are known not to have this problem
include:
  
   * linux-image-2.6.38-10-virtual (2.6.38-10.46) (Ubuntu 11.04/Natty Narwhal)
@@ -506, +496 @@

  If you have more information on the problem and better ways to avoid it, please do update
this space.
  
  <<Anchor(schema_disagreement)>>
+ 
  == What are schema disagreement errors and how do I fix them? ==
- 
  Cassandra schema updates [[LiveSchemaUpdates|assume that schema changes are done one-at-a-time]].
 If you make multiple changes at the same time, you can cause some nodes to end up with a
different schema than others.  (Before 0.7.6, this could also be caused by cluster system clocks
being substantially out of sync with each other.)
  
  To fix schema disagreements, you need to force the disagreeing nodes to rebuild their schema.
 Here's how:
@@ -519, +509 @@

  Cluster Information:
     Snitch: org.apache.cassandra.locator.SimpleSnitch
     Partitioner: org.apache.cassandra.dht.RandomPartitioner
-    Schema versions: 
+    Schema versions:
  75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
  5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
  }}}
- 
  Note which schemas are in the minority and mark down those IPs -- in the above example,
192.168.1.27. Log in to each of those machines and stop the Cassandra service/process by running
'sudo service cassandra stop' or 'kill <pid>'. Remove the schema* and migration* sstables
inside of your system keyspace (/var/lib/cassandra/data/system, if you're using the defaults).
  
  After starting Cassandra again, this node will notice the missing information and pull in
the correct schema from one of the other nodes.
@@ -531, +520 @@

  To confirm everything is on the same schema, verify that 'describe cluster;' only returns
one schema version.
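
For larger clusters, picking out the minority nodes by eye gets tedious. The sketch below parses text shaped like the 'describe cluster;' output shown above and lists the IPs that are not on the most widely held schema version. The helper names are hypothetical, and the parsing assumes the "uuid: [ip, ip]" line format from the example; it is not a cli feature.

```python
import re
from typing import Dict, List

# Matches lines such as "75eece10-...-4d205df954a7: [192.168.1.9, 192.168.1.25]".
SCHEMA_LINE = re.compile(r"^\s*([0-9a-f-]{36}):\s*\[(.*)\]\s*$")


def parse_schema_versions(describe_output: str) -> Dict[str, List[str]]:
    """Map each schema version UUID to the list of node IPs reporting it."""
    versions: Dict[str, List[str]] = {}
    for line in describe_output.splitlines():
        match = SCHEMA_LINE.match(line)
        if match:
            ips = [ip.strip() for ip in match.group(2).split(",") if ip.strip()]
            versions[match.group(1)] = ips
    return versions


def minority_schema_nodes(describe_output: str) -> List[str]:
    """Return IPs of nodes that are not on the most common schema version."""
    versions = parse_schema_versions(describe_output)
    if len(versions) <= 1:
        return []  # zero or one version: no disagreement to fix
    majority = max(versions, key=lambda v: len(versions[v]))
    return sorted(ip for v, ips in versions.items() if v != majority for ip in ips)


if __name__ == "__main__":
    sample = """Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
"""
    print(minority_schema_nodes(sample))
```

An empty result corresponds to the "only one schema version" check described above.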
  
  <<Anchor(dropped_messages)>>
+ 
  == Why do I see "... messages dropped.." in the logs? ==
- 
- Internode messages which are received by a node, but do not get to be processed within
rpc_timeout, are dropped rather than processed, as the coordinator node will no longer be waiting
for a response. If the coordinator node does not receive Consistency Level responses before
the rpc_timeout it will return a !TimedOutException to the client. If the coordinator receives
Consistency Level responses it will return success to the client. 
+ Internode messages which are received by a node, but do not get to be processed within
rpc_timeout, are dropped rather than processed, as the coordinator node will no longer be waiting
for a response. If the coordinator node does not receive Consistency Level responses before
the rpc_timeout it will return a !TimedOutException to the client. If the coordinator receives
Consistency Level responses it will return success to the client.
  
- For MUTATION messages this means that the mutation was not applied to all replicas it was
sent to. The inconsistency will be repaired by Read Repair or Anti Entropy Repair. 
+ For MUTATION messages this means that the mutation was not applied to all replicas it was
sent to. The inconsistency will be repaired by Read Repair or Anti Entropy Repair.
  
- For READ messages this means a read request may not have completed. 
+ For READ messages this means a read request may not have completed.
  
- Load shedding is part of the Cassandra architecture; if this is a persistent issue it is
generally a sign of an overloaded node or cluster. 
+ Load shedding is part of the Cassandra architecture; if this is a persistent issue it is
generally a sign of an overloaded node or cluster.
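
The coordinator behaviour described in this answer can be modelled with a toy function. This is a simplified sketch for intuition only and not Cassandra's implementation: `coordinate`, its parameters, and the local `TimedOutException` class are all hypothetical, and a dropped message is represented as a latency of `None`.

```python
from typing import Optional, Sequence


class TimedOutException(Exception):
    """Toy stand-in for the timeout error reported to the client."""


def coordinate(
    replica_latencies_ms: Sequence[Optional[float]],
    required_responses: int,
    rpc_timeout_ms: float,
) -> int:
    """Return how many replicas answered in time, or raise on too few.

    replica_latencies_ms: one entry per replica contacted; None means the
        replica dropped the message and never replied.
    required_responses: number of replies the consistency level demands.
    """
    on_time = sum(
        1
        for latency in replica_latencies_ms
        if latency is not None and latency <= rpc_timeout_ms
    )
    if on_time < required_responses:
        raise TimedOutException(
            f"got {on_time} of {required_responses} required responses"
        )
    return on_time


if __name__ == "__main__":
    timeout_ms = 10_000.0  # arbitrary value chosen for the example
    # One replica dropped the message, but two replies still satisfy
    # a requirement of two (e.g. QUORUM with three replicas).
    print(coordinate([5.0, 20.0, None], required_responses=2, rpc_timeout_ms=timeout_ms))
```

In the first case the client sees success even though one replica shed the message, which is the situation Read Repair or Anti Entropy Repair later corrects.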
  
  <<Anchor(cli_keys)>>
+ 
  == Why does the 0.8 cli not assume keys are strings anymore? ==
- 
  Prior to 0.8, there was no type metadata available for row keys, and the cli interface treated
all keys as strings.  This made the cli unusable for the many applications whose row keys were
numeric, UUIDs, or other non-string data.
  
  0.8 added key_validation_class to the !ColumnFamily definition, similarly to the existing
comparator for column names, and column_metadata validation_class for column values.  This
both lets clients know the expected data type, and rejects updates with non-conformant values.
