MIME-Version: 1.0
In-Reply-To: <E4496A45-2558-46D2-B1A0-30D9ABB5BD75@thelastpickle.com>
References: <BANLkTin_eCymOmsV_oL-X874txBK8cOTgg@mail.gmail.com>
 <301C715B-CD4A-4C09-9E4A-A2F04635D937@joestump.net>
 <BANLkTikxMk-PGknr08qvvtQzRr-T1eZhcw@mail.gmail.com>
 <BANLkTikZY1z_QV32c8Nfmva6U90bOVPaaA@mail.gmail.com>
 <E4496A45-2558-46D2-B1A0-30D9ABB5BD75@thelastpickle.com>
From: =?ISO-8859-1?Q?Patricio_Echag=FCe?= <patricioe@gmail.com>
Date: Sat, 9 Apr 2011 21:57:03 -0700
Message-ID: <BANLkTikKQpprCWTRLCLiYD4A0FZa9EA=FA@mail.gmail.com>
Subject: Re: Site Not Surviving a Single Cassandra Node Crash
To: user@cassandra.apache.org
Cc: aaron morton <aaron@thelastpickle.com>
Content-Type: multipart/alternative; boundary=000e0cd32ed669f10604a0894b8c

--000e0cd32ed669f10604a0894b8c
Content-Type: text/plain; charset=ISO-8859-1

What consistency level are you using?

And as Ed said, if you can provide the stack trace, that would help too.

On Sat, Apr 9, 2011 at 7:02 PM, aaron morton <aaron@thelastpickle.com> wrote:

> btw, the nodes are a tad out of balance. Was that deliberate?
>
> http://wiki.apache.org/cassandra/Operations#Token_selection
> http://wiki.apache.org/cassandra/Operations#Load_balancing
>
>
> Aaron
>
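
On the balance point, here is a rough sketch of the arithmetic behind the token selection page linked above. It assumes RandomPartitioner (token space 0 to 2**127) and that each node owns the range ending at its token; the function names are mine, not part of any Cassandra tool.

```python
# Sketch only: assumes RandomPartitioner, whose token space is 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Evenly spaced initial tokens: i * 2**127 / N for node i."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens):
    """Fraction of the ring owned by each token (its range ends at the token)."""
    ordered = sorted(tokens)
    shares = {}
    for index, token in enumerate(ordered):
        previous = ordered[index - 1]  # index 0 wraps around to the last token
        shares[token] = ((token - previous) % RING_SIZE) / RING_SIZE
    return shares


# Tokens copied from the ring output quoted further down in this thread.
current = [
    20237398133070283622632741498697119875,
    51358066040236348437506517944084891398,
    109158969152851862753910401160326064203,
    119944993359936402983569623214763193674,
    132756707369141912386052673276321963528,
]

print("current ownership:")
for token, share in ownership(current).items():
    print(f"{token:>40d}  {share:7.2%}")

print("balanced tokens for 5 nodes:")
for token in balanced_tokens(5):
    print(token)
```

With five evenly spaced tokens each node would own 20% of the ring, compared with the 6% to 34% spread reported in the ring output.
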
> On 10 Apr 2011, at 08:44, Ed Anuff wrote:
>
> Sounds like the problem might be on the hector side.  Lots of hector
> users on this list, but usually not a bad idea to ask on
> hector-users@googlegroups.com (cc'd).
>
> "The jetty servers stop responding" is a bit vague; somewhere in
> your logs there should be an error message that sheds some light on
> where things are going awry.  If you can find the exception that's
> being thrown in hector and post that, it'd make it much easier to
> help you out.
>
> Ed
>
> On Sat, Apr 9, 2011 at 12:11 PM, Vram Kouramajian
> <vram.kouramajian@gmail.com> wrote:
>
> The hector clients are used as part of our jetty servers. And the
> jetty servers stop responding when one of the Cassandra nodes goes down.
>
>
> Vram
>
>
> On Sat, Apr 9, 2011 at 11:54 AM, Joe Stump <joe@joestump.net> wrote:
>
> Did the Cassandra cluster go down or did you start getting failures from
> the client when it routed queries to the downed node? The key in the client
> is to keep working around the ring if the initial node is down.
>
>
> --Joe
>
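
To illustrate Joe's point about working around the ring, here is a minimal sketch of client-side failover. It is not Hector's implementation: `connect` is a hypothetical callable standing in for whatever opens a connection, and `fake_connect` only simulates one crashed node.

```python
# Sketch only: `connect` is a hypothetical callable, not a Hector or Thrift API.


class AllHostsDownError(Exception):
    """Raised when no host in the list accepts a connection."""


def connect_with_failover(hosts, connect):
    """Try each host in order and return the first live connection."""
    failures = {}
    for host in hosts:
        try:
            return connect(host)
        except (ConnectionError, TimeoutError) as error:
            failures[host] = error  # remember why, then try the next host
    raise AllHostsDownError(f"no reachable host: {failures}")


def fake_connect(host):
    """Stand-in for a real connection attempt; pretends one node has crashed."""
    if host == "192.168.89.153":
        raise ConnectionError("connection refused")
    return f"connection to {host}"


hosts = ["192.168.89.153", "192.168.89.155", "192.168.89.154"]
print(connect_with_failover(hosts, fake_connect))
```

If the client only ever retries the first host, a single crashed node can stall every request even though the cluster itself is still serving.
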
>
> On Apr 9, 2011, at 12:52 PM, Vram Kouramajian wrote:
>
>
> We have 5 Cassandra nodes with the following configuration:
>
>
> Cassandra Version: 0.6.11
>
> Number of Nodes: 5
>
> Replication Factor: 3
>
> Client: Hector 0.6.0-14
>
> Write Consistency Level: Quorum
>
> Read Consistency Level: Quorum
>
> Ring Topology:
>
> Address         Status  Load      Owns     Range                                       Ring
>                                            132756707369141912386052673276321963528
> 192.168.89.153  Up      4.15 GB   33.87%   20237398133070283622632741498697119875      |<--|
> 192.168.89.155  Up      5.17 GB   18.29%   51358066040236348437506517944084891398      |   ^
> 192.168.89.154  Up      7.41 GB   33.97%   109158969152851862753910401160326064203     v   |
> 192.168.89.152  Up      5.07 GB   6.34%    119944993359936402983569623214763193674     |   ^
> 192.168.89.151  Up      4.22 GB   7.53%    132756707369141912386052673276321963528     |-->|
>
>
> We believe that our setup should survive the crash of one of the
> Cassandra nodes. But we had a few crashes and the system stopped
> functioning until we brought back the Cassandra nodes.
>
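
For what it's worth, the expectation holds on paper. The sketch below checks it under one assumption: the replicas for a range are the token owner plus the next RF - 1 nodes clockwise (rack-unaware placement). The helper names are made up for illustration.

```python
# Sketch only: assumes replicas for a range are the token owner plus the
# next RF - 1 nodes clockwise (rack-unaware placement).


def quorum(replication_factor):
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def ranges_below_quorum(nodes, replication_factor, down):
    """Return the primary owners whose range cannot reach quorum."""
    needed = quorum(replication_factor)
    blocked = []
    for index, owner in enumerate(nodes):
        replicas = [nodes[(index + offset) % len(nodes)]
                    for offset in range(replication_factor)]
        alive = [node for node in replicas if node not in down]
        if len(alive) < needed:
            blocked.append(owner)
    return blocked


# Nodes in ring order, as listed in the quoted ring output.
ring = ["192.168.89.153", "192.168.89.155", "192.168.89.154",
        "192.168.89.152", "192.168.89.151"]

# Take each node down in turn and list the ranges that lose quorum.
for node in ring:
    print(node, "down ->", ranges_below_quorum(ring, 3, {node}))
```

With RF 3, quorum is 2, so every range keeps enough live replicas when any single node is down. That points at the client side rather than the cluster, which is why the exception from the logs matters.
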
>
> Any clues?
>
>
> Vram
>
>
>
>
>
>

--000e0cd32ed669f10604a0894b8c--