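This is expected behaviour. With RF=1 every key range has exactly one replica, so any request that touches the range owned by the downed node cannot be served. Either increase RF, or run nodetool decommission on a node before shutting it down to remove it from the ring.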
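
To illustrate the point about RF, here is a rough toy model, not Cassandra code. It assumes the simplest placement (each range lives on its owner plus the next RF-1 nodes clockwise) and a read that needs one live replica. The node names and both function names are made up for the example.

```python
def replicas_for_range(ring, owner_index, rf):
    """Nodes holding a range: the owner plus the next rf-1 nodes clockwise."""
    return [ring[(owner_index + i) % len(ring)] for i in range(rf)]


def unavailable_ranges(ring, down, rf, required=1):
    """Owners of the ranges that have fewer than `required` live replicas."""
    blocked = []
    for i, owner in enumerate(ring):
        live = [n for n in replicas_for_range(ring, i, rf) if n not in down]
        if len(live) < required:
            blocked.append(owner)
    return blocked


ring = ["node1", "node2", "node3", "node4"]  # hypothetical 4-node ring
down = {"node2"}                             # one node shut down

print(unavailable_ranges(ring, down, rf=1))  # ['node2']
print(unavailable_ranges(ring, down, rf=2))  # []
```

With RF=1 the range owned by the downed node has no live replica, so anything that has to cover the whole ring (such as computing input splits) fails. With RF=2 every range still has one live replica.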

On Tue, Aug 2, 2011 at 3:22 PM, Patrik Modesto <patrik.modesto@gmail.com> wrote:
Hi all!

I have a test cluster of 4 nodes running Cassandra 0.7.8, with one
keyspace at RF=1; each node owns 25% of the data. As long as all
nodes are alive there is no problem, but when I shut down just one
node I get UnavailableException in my application, cassandra-cli
returns "null", and the Hadoop MapReduce task won't start at all.

Losing one node is not a problem for me, as the data are not important;
losing even half the cluster is not a problem as long as everything
runs just as with a full cluster.

The error from Hadoop looks like this:
Exception in thread "main" java.io.IOException: Could not get input splits
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:120)
       at cz.xxx.yyy.zzz.DelegatingInputFormat.getSplits(DelegatingInputFormat.java:111)
       at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
       at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
       at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
       at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
       at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
       at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
       at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
       at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
       at cz.xxx.yyy.zzz.ContextIndexer.run(ContextIndexer.java:663)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
       at cz.xxx.yyy.zzz.ContextIndexer.main(ContextIndexer.java:94)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:597)
       at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 10.0.18.87
       at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
       at java.util.concurrent.FutureTask.get(FutureTask.java:83)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:116)
       ... 20 more
Caused by: java.io.IOException: failed connecting to all endpoints 10.0.18.87
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:197)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:67)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:153)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:138)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:662)



--
---------------------------------------------
Paul Loy
paul@keteracel.com
http://uk.linkedin.com/in/paulloy