cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: get_slice failing if replication factor > running nodes
Date Sat, 13 Aug 2011 03:00:12 GMT
It's pretty easy to fix. For background, read this page on the wiki: http://wiki.apache.org/cassandra/Operations
Also have a look at the help in the CLI; just type "help;".

If you want to reduce your RF, use an "update keyspace" statement and change the RF down to 1.

You can increase the RF as well, see the wiki page linked above. 


> what's the best practice to handle this kind of thing gracefully in production if we
> DID have two nodes and one needs to be taken offline (or crashes).
There is a subtle difference here. The error below is because there are not enough endpoints
in the cluster, counting both UP and DOWN nodes. If you had 2 nodes and RF 2 and one node went DOWN, you
would not get the error. So long as the node is known to the cluster, it can stay DOWN.
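
To make the distinction concrete, here is a minimal Python sketch of the check. It is a simplified model based only on the error message in the trace, not the actual SimpleStrategy code, and the node names and the check_replication helper are made up for illustration. The point is that the check counts every endpoint the cluster knows about, whether it is UP or DOWN.

```python
def check_replication(replication_factor, known_endpoints):
    """Simplified model: compare RF against all endpoints known to the
    cluster, regardless of whether each one is UP or DOWN."""
    if replication_factor > len(known_endpoints):
        raise ValueError(
            "replication factor (%d) exceeds number of endpoints (%d)"
            % (replication_factor, len(known_endpoints))
        )


# Hypothetical clusters: node names and states are illustrative only.
one_node = {"node1": "UP"}
two_nodes_one_down = {"node1": "UP", "node2": "DOWN"}

# Two known endpoints, one of them DOWN: the check passes with RF 2.
check_replication(2, two_nodes_one_down)

# Only one known endpoint: the check fails with RF 2.
try:
    check_replication(2, one_node)
except ValueError as err:
    print(err)
```

Passing the check only means the request is not rejected outright; whether it succeeds still depends on the consistency level and how many replicas are UP.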

In that situation you would only be able to operate at CL ONE or ANY, because QUORUM and ALL
both require 2 replicas when RF is 2.

I would just start with RF 1 and make changes as needed. One note: you need an RF of at least
3 to be able to work at QUORUM and handle at least 1 node failure.
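
The arithmetic behind that can be sketched in a few lines of Python. QUORUM is floor(RF / 2) + 1 replicas; the helper names below are illustrative, not part of any Cassandra API, and ANY is left out because it only applies to writes.

```python
def quorum(rf):
    """Number of replicas that make up a quorum for a given RF."""
    return rf // 2 + 1


def required_replicas(consistency_level, rf):
    """Replicas that must respond at each consistency level."""
    levels = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
    return levels[consistency_level]


def tolerated_failures(consistency_level, rf):
    """How many replicas can be DOWN while requests still succeed."""
    return rf - required_replicas(consistency_level, rf)


for rf in (1, 2, 3):
    for cl in ("ONE", "QUORUM", "ALL"):
        print("RF=%d CL=%-6s needs %d replica(s), tolerates %d down"
              % (rf, cl, required_replicas(cl, rf), tolerated_failures(cl, rf)))
```

With RF 2, QUORUM needs both replicas and so tolerates no failures; with RF 3, QUORUM needs 2 and tolerates 1.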

Cheers
 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 13 Aug 2011, at 05:33, ian douglas wrote:

> I'm testing a Cassandra 0.8.1 setup with SimpleCassie for PHP, and my get_slice is failing
> because when I created the keyspace I set it up like this:
> 
> create keyspace armorgames with strategy_options=[{replication_factor:2}] and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
> 
> ... in expectation that when this goes into production, we'll have a second node.
> 
> However, in the meantime, I'm just running one node, but now when I run a get_slice,
> I get the following error:
> 
> 
> ERROR [pool-2-thread-16] 2011-08-12 10:26:39,797 Cassandra.java (line 3041) Internal error processing get_slice
> java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)
>         at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:61)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:100)
>         at org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:1642)
>         at org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:1636)
>         at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:511)
>         at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:480)
>         at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:109)
>         at org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:263)
>         at org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:345)
>         at org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:306)
>         at org.apache.cassandra.thrift.Cassandra$Processor$get_slice.process(Cassandra.java:3033)
>         at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> 
> 
> So my first question is how can I get around this, secondly, can I 'alter' a keyspace
> to change its replication factor on the fly, and what impact does that have, and third, what's
> the best practice to handle this kind of thing gracefully in production if we DID have two
> nodes and one needs to be taken offline (or crashes).
> 
> Thanks,
> Ian
> 

