Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of sicoe.alexandru@googlemail.com
 designates 209.85.216.179 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CADVHTB9CRfMZewbnVnPNCp4HMWwUCW5y7+T5in5dxKq=7r1dYg@mail.gmail.com>
References: <AANLkTimz-1ZkXTF3tbrv8zsv8YCkMQhK6cahTZ8l1hgU@mail.gmail.com>
	<AANLkTikRELf4Ki3eprKThPnyLZNyv42JYmpgvb6dKG-8@mail.gmail.com>
	<B8817D70-58E9-40A1-9B2F-6451AADA776F@gmail.com>
	<AANLkTim-jgdZOC9tEpc2DamG6aJRenZhwddWBWyGaeTg@mail.gmail.com>
	<AANLkTikPOPGv64RZWh8UAPsRkzFZ57dKRhnE0uQqu5sX@mail.gmail.com>
	<AANLkTilZ9WVpBtzROosf534DEc7tAlsbh0G4hAMc7Ztl@mail.gmail.com>
	<1319728960375-6936767.post@n2.nabble.com>
	<CALdd-zg1wPmmuOMQzCgmCj7c3w_5D=JPuMUHALkF=Ro8zsuuXQ@mail.gmail.com>
	<CALdd-zit4oFgxn54nBXpGMvwY6G9kDqVsnaJ8NY3x+pLE5B-bA@mail.gmail.com>
	<CADVHTB9CRfMZewbnVnPNCp4HMWwUCW5y7+T5in5dxKq=7r1dYg@mail.gmail.com>
Date: Fri, 28 Oct 2011 09:59:53 +0200
Message-ID: 
 <CACCYQcysMcV_VrBvEDDceRxrDhihBkc__k_o8_3BnmKfeaEG4Q@mail.gmail.com>
Subject: Re: UnavailableException with 1 node down and RF=2?
From: Alexandru Dan Sicoe <sicoe.alexandru@googlemail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=00163642687334f80304b0574607

--00163642687334f80304b0574607
Content-Type: text/plain; charset=ISO-8859-1

Hi guys,
It's interesting to see this thread. I recently discovered a similar
problem on my 3-node Cassandra 0.8.5 cluster. It was working fine, then I
took a node down to see how it would behave. All of a sudden I couldn't
write or read because of this exception being thrown:

Exception in thread "main" me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
        at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:60)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:97)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:90)
        at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:102)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:108)
        at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:222)
        at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:219)
        at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
        at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
        at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:219)
        at ch.cern.pbeast.CassandraDBClient.executeBatchInsert(CassandraDBClient.java:958)
        at ch.cern.test.TimeBinTester.main(TimeBinTester.java:294)
Caused by: UnavailableException()
        at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19053)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
        at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
        ... 13 more

By the way, I'm using Hector 0.8.0-2, which has the following defaults:
    Default replication factor = 1
    Default replication strategy = SimpleStrategy
    Default consistency level policy = HconsistencyLevelPolicy.QUORUM
    Default failover policy = FailoverPolicy.ON_FAIL_TRY_ALL_AVAILABLE

When I first created the schema for my cluster I used these defaults. Then I
changed the consistency level to ONE for reads and ANY for writes, and I
thought everything would keep working if a node went down, but apparently not.
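
As a rough illustration of why the error message mentions replicas and
consistency level, here is a minimal Java sketch of the usual replica
arithmetic. It is not Cassandra or Hector code: the class ReplicaMath, the
Level enum and both methods are hypothetical names made up for this example,
and treating ANY as needing zero live replicas (because a hint can stand in
for a write) is a simplifying assumption.

```java
/** Illustrative model only; not part of Cassandra or Hector. */
final class ReplicaMath {
    enum Level { ANY, ONE, QUORUM, ALL }

    /** Number of live replicas a request needs under the usual rules. */
    static int requiredReplicas(Level level, int replicationFactor) {
        switch (level) {
            case ANY:
                return 0; // assumption: a stored hint is enough (writes only)
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // integer division
            case ALL:
                return replicationFactor;
            default:
                throw new IllegalArgumentException("unknown level: " + level);
        }
    }

    /** True if enough replicas are alive to satisfy the consistency level. */
    static boolean isAvailable(Level level, int replicationFactor, int liveReplicas) {
        return liveReplicas >= requiredReplicas(level, replicationFactor);
    }

    public static void main(String[] args) {
        // RF=1 with the key's only replica down.
        for (Level level : Level.values()) {
            System.out.println("RF=1, live=0, " + level + " -> " + isAvailable(level, 1, 0));
        }
        // RF=2 with one of the two replicas down.
        System.out.println("RF=2, live=1, ONE -> " + isAvailable(Level.ONE, 2, 1));
        System.out.println("RF=2, live=1, QUORUM -> " + isAvailable(Level.QUORUM, 2, 1));
    }
}
```

Under this model, with RF=1 a read at ONE cannot be served while the single
replica for a key is down, whereas a write at ANY would be expected to go
through. If writes still fail with UnavailableException, one thing worth
checking is whether the Mutator is really using the new consistency level
rather than the QUORUM default.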

One more thing: I'm using DataStax OpsCenter to monitor and manage my
cluster. Apart from the system and OpsCenter keyspaces, which I didn't
create, I have 2 other keyspaces. In total my cluster has 116 CFs. If I
click to view the replication of any node, I get 2 for the OpsCenter keyspace
and 1 for the two keyspaces I created, so everything seems fine. I should
mention that while the node was down I could read from the OpsCenter
keyspace without a problem, but I couldn't read or write to my own keyspaces.
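
The difference between the OpsCenter keyspace (RF=2) and the other keyspaces
(RF=1) can be sketched with a toy ring. This is only a model under stated
assumptions: three nodes with made-up tokens 99, 199 and 299, and keys placed
on the first node whose token is greater than or equal to the key's token and
then on the next nodes clockwise, which is roughly how SimpleStrategy is
usually described. RingModel and its methods are hypothetical names, not
Cassandra APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of SimpleStrategy-like placement; names and tokens are illustrative. */
final class RingModel {
    private final long[] tokens; // sorted node tokens, one per node

    RingModel(long[] sortedTokens) {
        this.tokens = sortedTokens.clone();
    }

    /** Node indexes holding a key: first node with token >= keyToken, then the next rf-1 clockwise. */
    List<Integer> replicasFor(long keyToken, int rf) {
        int first = 0; // wraps to node 0 when keyToken is beyond the last token
        for (int i = 0; i < tokens.length; i++) {
            if (keyToken <= tokens[i]) {
                first = i;
                break;
            }
        }
        List<Integer> replicas = new ArrayList<>();
        for (int k = 0; k < Math.min(rf, tokens.length); k++) {
            replicas.add((first + k) % tokens.length);
        }
        return replicas;
    }

    /** True if some replica of the key lives on a node other than downNode (enough for a read at ONE). */
    boolean readableAtOne(long keyToken, int rf, int downNode) {
        for (int node : replicasFor(keyToken, rf)) {
            if (node != downNode) {
                return true;
            }
        }
        return false;
    }

    /** Counts how many of the key tokens 0..keyCount-1 stay readable at ONE with one node down. */
    int countReadable(int keyCount, int rf, int downNode) {
        int readable = 0;
        for (long keyToken = 0; keyToken < keyCount; keyToken++) {
            if (readableAtOne(keyToken, rf, downNode)) {
                readable++;
            }
        }
        return readable;
    }

    public static void main(String[] args) {
        RingModel ring = new RingModel(new long[] {99L, 199L, 299L});
        System.out.println("RF=1 readable keys: " + ring.countReadable(300, 1, 1));
        System.out.println("RF=2 readable keys: " + ring.countReadable(300, 2, 1));
    }
}
```

With node 1 down, the model leaves 200 of 300 keys readable at RF=1 and all
300 at RF=2. That would be consistent with OpsCenter staying readable while a
batch touching many rows in an RF=1 keyspace is likely to run into at least
one unavailable key.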

Any idea where to look to investigate this further?

Cheers,
Alex

On Thu, Oct 27, 2011 at 10:27 PM, R. Verlangen <robin@us2.nl> wrote:

> That's correct. It was a read consistency problem, not so smart of me ;-)
>
> Thank you anyway.
>
>
> 2011/10/27 Jonathan Ellis <jbellis@gmail.com>
>
>> (I see that you did start a new thread and solved it with Jake's help.)
>>
>> On Thu, Oct 27, 2011 at 11:23 AM, Jonathan Ellis <jbellis@gmail.com>
>> wrote:
>> > Ha.  On the one hand, good on you for searching the list archives for
>> > similar problems.  On the other hand, after over a year it's probably
>> > worth starting a new thread. :)
>> >
>> > Standard questions:
>> >
>> > - What Cassandra version are you running?
>> > - Are there exceptions in the log for the machine still running?
>> > - What does "not responding anymore" mean?  Reporting timeouts,
>> > reporting unavailable, refusing client connections, ... ?
>> >
>> > On Thu, Oct 27, 2011 at 10:22 AM, RobinUs2 <robin@us2.nl> wrote:
>> >> I'm currently having a similar problem with a 2-node cluster. When I
>> >> shut down one of the nodes, the other isn't responding any more.
>> >>
>> >> Did you find a solution for your problem?
>> >>
>> >> /I'm new to mailing lists, if it's inappropriate to reply here, please
>> >> let me know../
>> >>
>> >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/2-node-cluster-1-node-down-overall-failure-td6936722.html
>> >>
>> >> --
>> >> View this message in context:
>> >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/UnavailableException-with-1-node-down-and-RF-2-tp5242055p6936767.html
>> >> Sent from the cassandra-user@incubator.apache.org mailing list archive
>> >> at Nabble.com.
>> >>
>> >
>> >
>> >
>> > --
>> > Jonathan Ellis
>> > Project Chair, Apache Cassandra
>> > co-founder of DataStax, the source for professional Cassandra support
>> > http://www.datastax.com
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>

--00163642687334f80304b0574607--