Subject: Re: performance problems on new cluster
From: aaron morton <aaron@thelastpickle.com>
Date: Fri, 12 Aug 2011 14:11:06 +1200
To: user@cassandra.apache.org

> iostat doesn't show a request queue bottleneck. The timeouts we are seeing are for reads. The latency on the nodes I have temporarily used for reads is around 2-45ms. The next token in the ring at an alternate DC is showing ~4ms with everything else around 0.05ms. tpstats doesn't show any active/pending.
> Reads are at CL.ONE & Writes using CL.ANY

OK, node latency is fine and you are using some pretty low consistency. You said NTS with RF 2, is that RF 2 for each DC?

The steps below may help get an idea of what's going on (there's a rough sketch of the commands after the list)...

1) use nodetool getendpoints to determine which nodes are replicas for a key.
2) connect directly to one of the endpoints with the CLI, ensure CL is ONE and do your test query.
3) connect to another node in the same DC that is not a replica and do the same.
4) connect to another node in a different DC and do the same.
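
To make those steps concrete, here's a rough sketch of the commands. The keyspace name (Keyspace1) is a placeholder for whatever your schema uses, cf1 / 'user-id' / 'seg' are taken from your query, and 9160 is the default Thrift port:

    # 1) find which nodes hold replicas for the key (run from any node)
    nodetool -h <any-node> getendpoints Keyspace1 cf1 user-id

    # 2) run the test query directly against one of the returned replicas;
    #    ONE is already the CLI default, setting it explicitly (where your
    #    CLI build supports the statement) just makes that obvious
    cassandra-cli -h <replica-ip> -p 9160
        use Keyspace1;
        consistencylevel as ONE;
        get cf1['user-id']['seg'];

    # 3) and 4) repeat the same CLI session against a non-replica in the
    #           same DC, then against a node in the other DC, and compare.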

Once you can repro it, try turning up the logging on the coordinator to DEBUG; you can do this via JConsole. Look for these lines...

* Command/ConsistencyLevel is...
* reading data locally... or reading data from...
* reading digest locally... or reading digest for from...
* Read timeout:...

You'll also see some lines about receiving messages from other nodes. Hopefully you can get an idea of which nodes are involved in a failing query. Getting a thrift TimedOutException on a read with CL ONE is pretty odd.

> What can I do to confirm that this issue is still outstanding and/or that we are affected by it?

It's in 0.8 and will not be fixed. My unscientific approach was to repair a single CF at a time, hoping that the differences would be smaller and less data would be streamed.

Minor compaction should help squish things down. If you want to get more aggressive, reduce the min compaction threshold and trigger a minor compaction with nodetool flush.
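
For example (a sketch only; Keyspace1 and cf1 are placeholders again, 2 is a lowered minimum and 32 is the default maximum; check nodetool's usage output on your build for the exact argument order):

    # lower the min number of SSTables needed to trigger a minor compaction
    # (defaults are 4 min / 32 max)
    nodetool -h <node> setcompactionthreshold Keyspace1 cf1 2 32

    # flushing writes out a fresh SSTable, which can kick off the minor compaction
    nodetool -h <node> flush Keyspace1 cf1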

> Version of failure detection? I've not seen anything on this so I suspect this is the default.

I was asking so I could see if there were any fixes in Gossip or the FailureDetector that you were missing. Check the CHANGES.txt file.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 Aug 2011, at 12:48, Anton Winter wrote:

>
>> Is there a reason you are using the trunk and not one of the tagged releases? Official releases are a lot more stable than the trunk.
>>
> Yes, as we are using a combination of EC2 and colo servers we need to use broadcast_address from CASSANDRA-2491. The patch associated with that JIRA does not apply cleanly against 0.8, which is why we are using trunk.
>
>>> 1) thrift timeouts & general degraded response times
>> For reads or writes? What sort of queries are you running? Check the local latency on each node using cfstats and cfhistograms, and a bit of iostat http://spyced.blogspot.com/2010/01/linux-performance-basics.html What does nodetool tpstats say, is there a stage backing up?
>>
>> If the local latency is OK look at the cross DC situation. What CL are you using? Are nodes timing out waiting for nodes in other DCs?
>
> iostat doesn't show a request queue bottleneck. The timeouts we are seeing are for reads. The latency on the nodes I have temporarily used for reads is around 2-45ms. The next token in the ring at an alternate DC is showing ~4ms with everything else around 0.05ms. tpstats doesn't show any active/pending. Reads are at CL.ONE & Writes using CL.ANY
>
>>
>>> 2) *lots* of exception errors, such as:
>> Repair is trying to run on a response which is a digest response; this should not be happening. Can you provide some more info on the type of query you are running?
>>
> The query being run is get cf1['user-id']['seg']
>
>
>>> 3) ring imbalances during a repair (refer to the above nodetool ring output)
>> You may be seeing this
>> https://issues.apache.org/jira/browse/CASSANDRA-2280
>> I think it's a mistake that it is marked as resolved.
>>
> What can I do to confirm that this issue is still outstanding and/or that we are affected by it?
>
>>> 4) regular failure detection when any node does something only moderately stressful, such as a repair, or is under light load etc., but the node itself thinks it is fine.
>> What version are you using?
>>
> Version of failure detection? I've not seen anything on this so I suspect this is the default.
>
>
> Thanks,
> Anton
>