Date: Thu, 24 Nov 2011 09:42:20 -0800
Subject: Re: Pending ReadStage is exploding on only one node
From: Peter Schuller <peter.schuller@infidyne.com>
To: user@cassandra.apache.org

> I'm measuring a high load value on a few nodes during the update process
> (which is normal), but one node keeps the high load after the process for a
> long time.

I would say that either the reading that you do is overloading that
one node and other traffic is piling up as a result, or you're
stomping on the page cache by reading a lot from that one node (e.g.
using CL.ONE), and you're then seeing ReadStage backed up until the
page cache or row cache is warm again.

In general, unless you're running at close to full CPU capacity, it
sounds like you're completely disk bound, and that will show up as a
huge number of pending ReadStage tasks. "iostat -x -k 1" should
confirm it: look for %util close to 100 on the data disks.
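
If you'd rather not eyeball the iostat output, here is a rough Python
sketch that pulls the %util column out of captured "iostat -x" text and
lists the devices that look saturated. The function names, the 90%
threshold and the sample text are my own assumptions for illustration
(the numbers are made up, not measurements from your cluster), and the
column layout differs between sysstat versions, which is why the sketch
looks columns up by header name instead of by position.

```python
# Illustrative sample only: invented numbers in the classic sysstat layout.
SAMPLE = """\
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.10    0.00    1.20   45.30    0.00   50.40

Device:  rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda      0.00 1.00 2.00 3.00 16.00 32.00 19.20 0.01 1.50 0.80 0.40
sdb      12.00 0.00 410.00 2.00 9800.00 16.00 47.65 38.20 92.70 2.42 99.80
"""


def disk_utilization(iostat_output):
    """Return {device: %util} from `iostat -x` text.

    If the text holds several reports, later values overwrite earlier
    ones, so the result reflects the most recent report per device.
    """
    util = {}
    columns = None
    for line in iostat_output.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the current device table
            continue
        if fields[0].startswith("Device"):
            columns = fields  # header row; remember the column names
            continue
        if columns and "%util" in columns and len(fields) == len(columns):
            util[fields[0]] = float(fields[columns.index("%util")])
    return util


def saturated(iostat_output, threshold=90.0):
    """Return the sorted device names whose %util is at or above threshold."""
    found = disk_utilization(iostat_output)
    return sorted(dev for dev, pct in found.items() if pct >= threshold)


if __name__ == "__main__":
    print(saturated(SAMPLE))
```

To use it on a real node you would feed it the captured output of the
iostat command instead of SAMPLE. A device that stays in that list
while ReadStage is backed up is consistent with being disk bound; if
nothing is listed, the bottleneck is probably elsewhere.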

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)