Subject: Re: Request timeout and host marked down
From: aaron morton
Date: Mon, 9 Apr 2012 09:15:59 +1200
To: user@cassandra.apache.org

You need to see if the timeout is from the client to the server, or between the server nodes.

If it's server side, a TimedOutException will be thrown from Thrift. Take a look at nodetool tpstats on the servers; you will probably see lots of "Pending" tasks.

Basically the cluster is overloaded. Consider:

* checking the IO, CPU, and GC state on the servers.
* ensuring the data and requests are evenly spread around the cluster.
* reducing the number of columns read in a select.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/04/2012, at 5:30 AM, Daning Wang wrote:

> Hi all,
>
> We are using Hector and often we see lots of timeout exceptions in the log. I know that Hector can fail over to another node, but I want to reduce the number of timeouts.
>
> Is there any Hector parameter I should change to reduce this error?
>
> Also, on the server side, is there any kind of tuning needed for the timeout?
>
> Thanks in advance.
>
> 12/04/04 15:13:20 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
> 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160
> 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: :{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.240.113.171(10.240.113.171):9160
> 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: :{10.240.113.171(10.240.113.171):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: :{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.123.83.114(10.123.83.114):9160
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: :{10.123.83.114(10.123.83.114):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.6.115.239(10.6.115.239):9160
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: :{10.6.115.239(10.6.115.239):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:49 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
> 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.120.205.48(10.120.205.48):9160
> 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: :{10.120.205.48(10.120.205.48):9160}; IsActive?: true; Active: 3; Blocked: 0; Idle: 3; NumBeforeExhausted: 17
> 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.20.200(10.28.20.200):9160
> 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: :{10.28.20.200(10.28.20.200):9160}; IsActive?: true; Active: 2; Blocked: 0; Idle: 4; NumBeforeExhausted: 18
> 12/04/04 15:13:51 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
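
One way to check whether the timeouts are spread evenly around the cluster is to tally the "MARK HOST AS DOWN TRIGGERED" lines per host in the client log. The following is only a rough sketch in Python, assuming the Hector log format quoted above; the function name count_down_marks and the embedded sample list are made up for illustration and are not part of Hector or Cassandra.

```python
import re
from collections import Counter

# Matches client log lines of the form quoted above, e.g.
# "... MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160"
# and captures the host name that precedes the opening parenthesis.
DOWN_RE = re.compile(r"MARK HOST AS DOWN TRIGGERED for host (\S+?)\(")


def count_down_marks(lines):
    """Return a Counter mapping host -> number of times it was marked down.

    Hypothetical helper for illustration; lines that do not match are ignored.
    """
    counts = Counter()
    for line in lines:
        match = DOWN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the quoted log, used here as sample input.
SAMPLE = [
    "12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: "
    "MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160",
    "12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: "
    "MARK HOST AS DOWN TRIGGERED for host 10.240.113.171(10.240.113.171):9160",
    "12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: "
    "MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160",
    "12/04/04 15:13:49 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: "
    "Timout 10000 ms",
]

# Print hosts from most to least frequently marked down.
for host, marks in count_down_marks(SAMPLE).most_common():
    print(host, marks)
```

If a single host dominates the tally, that node is worth looking at first (IO, CPU, GC). If the marks are spread over most of the hosts, as they appear to be in the excerpt above, that is more consistent with the cluster as a whole being overloaded.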