incubator-cassandra-user mailing list archives

From Alexandru Dan Sicoe <sicoe.alexan...@googlemail.com>
Subject Re: UnavailableException with 1 node down and RF=2?
Date Fri, 28 Oct 2011 07:59:53 GMT
Hi guys,
 It's interesting to see this thread. I recently discovered a similar
problem on my 3-node Cassandra 0.8.5 cluster. It was working fine, then I
took a node down to see how the cluster behaves. All of a sudden I couldn't
write or read because of this exception being thrown:

Exception in thread "main" me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
        at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:60)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:97)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:90)
        at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:102)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:108)
        at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:222)
        at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:219)
        at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
        at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
        at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:219)
        at ch.cern.pbeast.CassandraDBClient.executeBatchInsert(CassandraDBClient.java:958)
        at ch.cern.test.TimeBinTester.main(TimeBinTester.java:294)
Caused by: UnavailableException()
        at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19053)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
        at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
        ... 13 more

By the way, I'm using Hector 0.8.0-2, which has the following defaults:
    Default replication factor = 1
    Default replication strategy = SimpleStrategy
    Default consistency level policy = HconsistencyLevelPolicy.QUORUM
    Default failover policy = FailoverPolicy.ON_FAIL_TRY_ALL_AVAILABLE

When I first created the schema for my cluster I used these defaults. I then
changed the consistency level to ONE for reads and ANY for writes, expecting
everything to keep working if a node went down, but apparently it doesn't.

One more thing: I'm using DataStax OpsCenter to monitor and manage my
cluster. Apart from the System and OpsCenter keyspaces, which I didn't
create, I have two other keyspaces. In total my cluster has 116 CFs. If I
click to view the replication of any node I get 2 for the OpsCenter keyspace
and 1 for the two keyspaces I created, so everything seems fine. I should
mention that while the node was down I could read from the OpsCenter
keyspace without a problem, but I couldn't read or write to my own keyspaces.
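
For reference, here is the replica arithmetic as I understand it, as a rough
sketch only. This is not Cassandra or Hector code: the class ReplicaCheck and
its methods are names I made up for illustration, and it assumes the usual
rule that ONE needs 1 live replica, QUORUM needs RF/2 + 1, and ALL needs RF.
ANY is left out on purpose, since it depends on hinted handoff.

```java
public class ReplicaCheck {
    // Hypothetical enum for illustration; not Hector's HConsistencyLevel.
    enum Level { ONE, QUORUM, ALL }

    // Number of replicas that must be alive for a request at this level,
    // given the keyspace replication factor (rf).
    static int required(Level level, int rf) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return rf / 2 + 1; // integer division
            default:
                return rf; // ALL
        }
    }

    // True if enough replicas of a key are alive to satisfy the level.
    static boolean available(Level level, int rf, int liveReplicas) {
        return liveReplicas >= required(level, rf);
    }

    public static void main(String[] args) {
        // RF=1, and the only replica of a key is on the node that is down.
        System.out.println(available(Level.ONE, 1, 0));    // prints false
        // RF=2, one of the two replicas is down.
        System.out.println(available(Level.ONE, 2, 1));    // prints true
        System.out.println(available(Level.QUORUM, 2, 1)); // prints false
    }
}
```

If that rule holds, it would fit what I see: with RF=2 the OpsCenter keyspace
still has a live replica for every key, while with RF=1 any key that lived on
the downed node has none left.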

Any idea where to look to investigate this further?

Cheers,
Alex

On Thu, Oct 27, 2011 at 10:27 PM, R. Verlangen <robin@us2.nl> wrote:

> That's correct. It was a read consistency problem, not so smart of me ;-)
>
> Thank you anyway.
>
>
> 2011/10/27 Jonathan Ellis <jbellis@gmail.com>
>
>> (I see that you did start a new thread and solved it with Jake's help.)
>>
>> On Thu, Oct 27, 2011 at 11:23 AM, Jonathan Ellis <jbellis@gmail.com>
>> wrote:
>> > Ha.  On the one hand, good on you for searching the list archives for
>> > similar problems.  On the other hand, after over a year it's probably
>> > worth starting a new thread. :)
>> >
>> > Standard questions:
>> >
>> > - What Cassandra version are you running?
>> > - Are there exceptions in the log for the machine still running?
>> > - What does "not responding anymore" mean?  Reporting timeouts,
>> > reporting unavailable, refusing client connections, ... ?
>> >
>> > On Thu, Oct 27, 2011 at 10:22 AM, RobinUs2 <robin@us2.nl> wrote:
>> >> I'm currently having a similar problem with a 2-node cluster. When I
>> >> shut down one of the nodes, the other isn't responding any more.
>> >>
>> >> Did you find a solution for your problem?
>> >>
>> >> /I'm new to mailing lists, if it's inappropriate to reply here, please
>> >> let me know../
>> >>
>> >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/2-node-cluster-1-node-down-overall-failure-td6936722.html
>> >>
>> >> --
>> >> View this message in context:
>> >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/UnavailableException-with-1-node-down-and-RF-2-tp5242055p6936767.html
>> >> Sent from the cassandra-user@incubator.apache.org mailing list archive
>> >> at Nabble.com.
>> >>
>> >
>> >
>> >
>> > --
>> > Jonathan Ellis
>> > Project Chair, Apache Cassandra
>> > co-founder of DataStax, the source for professional Cassandra support
>> > http://www.datastax.com
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>
