Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <AANLkTikjBMwhusHKDJ7HnUiId5ZDz6zgIqYxZr1e-YX1@mail.gmail.com>
References: <AANLkTil4jdDw6zgdBXZAFI1ZA-6VKbtSPvdGwsEQKK7Y@mail.gmail.com>
	 <AANLkTim0hOsV001CFM_OY8fvDoEzXHX8Rv0mqhVJWZDA@mail.gmail.com>
	 <AANLkTimQJ5xpm1R7vk5Fq2xfUjJdL-K2WVkeZ6Uf9b_g@mail.gmail.com>
	 <AANLkTin12loM4y7O5SW3jyOxCnYAeBepnx5siCyzr39u@mail.gmail.com>
	 <AANLkTikla6iFoVA-qdjBORR89AJO7EbuXkxigGO5dfXO@mail.gmail.com>
	 <AANLkTin9H3cjs2CYsovNNmSOJw3tu_V_TVOoDY-Rh32B@mail.gmail.com>
	 <AANLkTikRnP2387oS2gt3QUiJ4wTPDY51vOtFsrEVTk3D@mail.gmail.com>
	 <AANLkTikjBMwhusHKDJ7HnUiId5ZDz6zgIqYxZr1e-YX1@mail.gmail.com>
Date: Sat, 22 May 2010 22:07:03 -0400
Message-ID: <AANLkTilSDTHm3RNJFDHYrKB9cy54Qf57kQYnes1Z2YlT@mail.gmail.com>
Subject: Re: Scaling problems
From: Ian Soboroff <isoboroff@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=00163646b5de6606a304873961cb

--00163646b5de6606a304873961cb
Content-Type: text/plain; charset=ISO-8859-1

I'll try this.  Hinted handoff (HH) is backing up because nodes are failing.
I haven't read the code, but why should HH suck CPU?  As I understand it,
there's nothing to hand off until the destination comes back up, and Gossip
should tell us when that happens, no?  In the interim, it's just a cache of
writes waiting to be sent.
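
To make that mental model concrete, here is a toy Python sketch of how I
would expect it to behave.  This is not Cassandra's code (which, again, I
haven't read); HintStore, send, and on_gossip are names I made up for
illustration.

```python
from collections import defaultdict, deque


class HintStore:
    """Toy model only: hints sit idle per destination until it is reported up."""

    def __init__(self, send):
        self.send = send                 # hypothetical transport: send(dest, mutation)
        self.hints = defaultdict(deque)  # destination -> writes waiting to be sent
        self.up = set()                  # destinations currently believed alive

    def write(self, dest, mutation):
        if dest in self.up:
            self.send(dest, mutation)
        else:
            # Destination is down: just buffer the write, no further work.
            self.hints[dest].append(mutation)

    def on_gossip(self, dest, alive):
        """Hypothetical callback fired when gossip changes its view of dest."""
        if not alive:
            self.up.discard(dest)
            return
        self.up.add(dest)
        # Replay buffered writes in arrival order, then forget them.
        queue = self.hints.pop(dest, deque())
        while queue:
            self.send(dest, queue.popleft())
```

In this model, CPU is spent only in on_gossip when a node comes back, which
is why a large backlog by itself wouldn't obviously cost anything.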

Is there some way to tell the system, "Just stop caring; I'm only writing.
Let's worry about leveling out when I get around to wanting to read"?

Ian

On Fri, May 21, 2010 at 9:06 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> On Fri, May 21, 2010 at 9:09 AM, Ian Soboroff <isoboroff@gmail.com> wrote:
> > HINTED-HANDOFF-POOL               1       158             23
>
> This is your smoking gun.  HH tasks suck a ton of CPU and you have 158
> backed up.
>
> I would just blow the HH files away from your data/system directory,
> restart the node, and run repair (assuming all your other nodes are
> alive again).
>

--00163646b5de6606a304873961cb--