Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 1F35F9F3C for ; Tue, 24 Apr 2012 14:03:30 +0000 (UTC) Received: (qmail 49394 invoked by uid 500); 24 Apr 2012 14:03:27 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 49370 invoked by uid 500); 24 Apr 2012 14:03:27 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 49361 invoked by uid 99); 24 Apr 2012 14:03:27 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Apr 2012 14:03:27 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of hosschao@gmail.com designates 209.85.213.172 as permitted sender) Received: from [209.85.213.172] (HELO mail-yx0-f172.google.com) (209.85.213.172) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Apr 2012 14:03:21 +0000 Received: by yenm5 with SMTP id m5so418447yen.31 for ; Tue, 24 Apr 2012 07:03:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=JeIUmePxSpswcjpsyBECbFvshA45tOCs1qwGN+toX5A=; b=Yqzeh9LazocYptCj3SsHGOVDv70dMg5secWgi3qSifkfaEv0wAmh0BHCsMl9AwvE7n aajyPAEN1rttNi8un5JuSVIM1TX+30ej+W3GrwGgntedK5JChufFpd8MRoLUtIshbWx2 wMDOwpjW8ds8aahH+GST6RTYlYW89Qonx1wFKbl0vC9uGG1ikRZy9b/USXXeCUzTr+Zg I/YLaYufSiB67OGkH4U6L7ilRNhU9y7Z4nkmlQGcgcQSHsIfjjKpCyMexBHJtK3v/WW6 bpg+sHtzCN3mbEJw9MK/3IqkCpU6KOhHK8fvmkOeINecewg+J7R/YopWU6qNcPYJ2vyL Rd+A== MIME-Version: 1.0 Received: by 10.50.47.230 with SMTP id g6mr11411849ign.74.1335276180064; Tue, 24 Apr 2012 07:03:00 -0700 (PDT) Received: by 10.231.61.9 with HTTP; Tue, 24 Apr 2012 07:02:59 -0700 (PDT) In-Reply-To: References: Date: Tue, 24 Apr 2012 22:03:00 +0800 Message-ID: Subject: Re: High latency of cassandra From: =?GB2312?B?wu2zrA==?= To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=14dae9340a3763cae504be6d3675 X-Virus-Checked: Checked by ClamAV on apache.org --14dae9340a3763cae504be6d3675 Content-Type: text/plain; charset=GB2312 Content-Transfer-Encoding: quoted-printable Thank you very much for your replay~ The full message like this: DEBUG [Thread-6] 2012-04-24 21:04:11,024 IncomingTcpConnection.java (line 116) Version is now 3 During "blocking time" I only saw this message(*appendix shown) *and seems everything is blocked. I logged all cassandra calling time in RPC and there is no large delay but RPC itself delay a lot. I also using JConsole to monitor the GC time (both cassandra and RPC server) and didn't find large GC delay. Is my OS blocked during that time? I also list tpstats as well as compactionstats in the Appendix. Are there something wrong with my cassandra? Thank you very much for your help again :) *Appendix:* *cassandra debug output block:* DEBUG [Thread-6] 2012-04-24 21:04:11,024 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:11,430 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:11,431 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:11,901 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:11,902 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:12,024 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:12,431 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:12,431 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:13,024 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:13,750 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:13,751 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:14,025 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:14,902 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:14,902 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:15,025 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:15,751 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:15,751 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:16,025 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:16,431 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:16,432 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:16,902 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:16,903 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:17,025 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:18,025 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:18,902 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:18,903 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:19,026 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:19,752 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:19,752 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:19,903 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:19,903 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:20,026 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:20,752 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-6] 2012-04-24 21:04:20,752 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-7] 2012-04-24 21:04:21,026 IncomingTcpConnection.java (line 116) Version is now 3 DEBUG [Thread-4] 2012-04-24 21:04:22,026 IncomingTcpConnection.java (line 116) Version is now 3 *tpstats:* Pool Name Active Pending Completed Blocked All time blocked ReadStage 0 0 171002 0 0 RequestResponseStage 0 0 1016620 0 0 MutationStage 0 0 278581 0 0 ReadRepairStage 0 0 0 0 0 ReplicateOnWriteStage 0 0 0 0 0 GossipStage 0 0 1123513 0 0 AntiEntropyStage 0 0 0 0 0 MigrationStage 0 0 0 0 0 MemtablePostFlusher 0 0 23 0 0 StreamStage 0 0 0 0 0 FlushWriter 0 0 23 0 0 MiscStage 0 0 0 0 0 InternalResponseStage 0 0 0 0 0 HintedHandoff 0 0 0 0 0 Message type Dropped RANGE_SLICE 0 READ_REPAIR 0 BINARY 0 READ 0 MUTATION 0 REQUEST_RESPONSE 0 *compactionstats:* pending tasks: 0 2012/4/24 Maki Watanabe > If you set trace level for IncomingTCPConnection, the message "Version > is now ..." will be printed for every inter-cassandra message received > by the node, including Gossip. > Enabling this log in high traffic will saturate IO for your log disk by > itself. > > You should better to inspect nodetool tpstats, compactionstats at first. > > maki > > 2012/4/24 =C2=ED=B3=AC : > > Does any one have idea for this? Thanks~ > > > > > > 2012/4/24 =C2=ED=B3=AC > >> > >> Hi all, > >> > >> I have some troubles of cassandra in my production: > >> > >> I build up a RPC server which using hector client to manipulate the > >> cassandra. Wired things happen nowadays: the latency of RPC sometimes > >> becames very high (10seconds~70seconds) in several minutes and reduce > >> to normal level (30ms in average) after that time. I investigate the > >> debug log of cassandra. During high latency time, the cassandra output > >> lots of message like: > >> "IncomingTcpConnection.java(116) Version is now 3. " > >> Seems everything be blocked during that time. > >> > >> Our settings as following: > >> The version of cassandra is 1.0.1 and hector version is 0.7.0 for the > >> compatible of thrift version which we use (0.5.0) > >> The cluster contains 4 nodes and all of them are seeds. The > >> gcgraceseconds is 0 since we needn't delete the data > >> > >> p.s. It works well for a long time (3 months) but becames crazy these > >> days after we push the new RPC server which supports bigger data > >> saving (2mb in average). I'm not sure if these is the reason. > >> > >> Hope getting your replay~~ > >> > >> Thanks, > >> > >> Chao. > > > > > --14dae9340a3763cae504be6d3675 Content-Type: text/html; charset=GB2312 Content-Transfer-Encoding: quoted-printable
Thank you very much for your replay~

The full message = like this:
DEBUG [Thread-6] 2012-04-24 21:0= 4:11,024 IncomingTcpConnection.java (line 116) Version is now 3

During &quo= t;blocking time" I only saw this message(appendix shown) a= nd seems everything is blocked.

I logged all cassandra calling time in RPC and there is no large delay but = RPC itself delay a lot.
I also using JConso= le to monitor the GC time (both cassandra and RPC server) and didn't fi= nd large GC delay.
Is my OS blocked during that time? 

I also list t= pstats as well as compactionstats in the Appendix. Are there something wron= g with my cassandra?

Thank you v= ery much for your help again :)

<= div class=3D"gmail_extra">Appendix:
cassandra debug output block:

DEBUG [Thread-6] 2012-04-24 21:04:11,024 IncomingTcpCon= nection.java (line 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:11,430 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:11,431 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:11,901 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:11,902 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:12,024 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:12,431 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:12,431 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:13,024 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:13,750 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:13,751 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:14,025 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:14,902 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:14,902 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:15,025 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:15,751 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:15,751 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:16,025 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:16,431 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:16,432 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:16,902 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:16,903 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:17,025 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:18,025 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:18,902 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:18,903 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:19,026 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:19,752 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:19,752 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:19,903 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:19,903 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:20,026 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:20,752 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-6] 2012-04-24 21:04:20,752 Incomin= gTcpConnection.java (line 116) Version is now 3
DEBUG [Thread-7] 2012-04-24 21:04:21,026 IncomingTcpConnection.java (l= ine 116) Version is now 3
DEBUG [Thread-4] 2012-04-24 21:04:22,026 Incomin= gTcpConnection.java (line 116) Version is now 3

tpstats:

Pool Name   &nbs= p;                Active   Pen= ding      Completed   Blocked  All time blocked
ReadStage           =               0       &nb= sp; 0         171002         0 &nbs= p;               0
RequestResponseStage      0  = ;       0        1016620     &= nbsp;   0                 0
MutationStage         &nb= sp;          0         0  = ;       278581         0     &= nbsp;           0
ReadRepairStage         &nbs= p;      0         0      =        0         0     &= nbsp;           0
= ReplicateOnWriteStage        0       &nb= sp; 0              0      = ;   0                 0
GossipStage           &= nbsp;           0         0 &n= bsp;      1123513         0    = ;             0
AntiEntropyStage                = 0         0             &= nbsp;0         0           &nb= sp;     0
MigrationStage          = ;          0         0   =            0         0 &n= bsp;               0
MemtablePostFlusher          0  =       0             23  =       0               &n= bsp; 0
StreamStage           &= nbsp;           0         0 &n= bsp;            0        = 0                 0
FlushWriter             &nb= sp;           0         0 &nbs= p;           23         0 &nbs= p;               0
MiscStage           &nb= sp;               0      =   0              0     &= nbsp;   0                 0
InternalResponseStage       &n= bsp; 0         0           &nb= sp;  0         0          = ;       0
HintedHandoff          =           0         0   =            0         0 &n= bsp;               0

Message type    = ;       Dropped
RANGE_SLICE                  0=
READ_REPAIR         &n= bsp;        0
BINARY &n= bsp;                     =       0
READ     &= nbsp;                    =      0
MUTATION           &nbs= p;            0
REQUEST_RESPONSE     0


compactionstats:<= /div>

pending tas= ks: 0

=

<= br>

2012/4/24 Maki Watanabe &l= t;watanabe.mak= i@gmail.com>
If you set trace level for IncomingTCPConnection, the message "Version=
is now ..." will be printed for every inter-cassandra message received=
by the node, including Gossip.
Enabling this log in high traffic will saturate IO for your log disk by its= elf.

You should better to inspect nodetool tpstats, compactionstats at first.
maki

2012/4/24 =C2=ED=B3=AC <hosschao@g= mail.com>:
> Does any one have idea for thi= s? Thanks~
>
>
> 2012/4/24 =C2=ED=B3=AC <hossc= hao@gmail.com>
>>
>> Hi all,
>>
>> I have some troubles of cassandra in my production:
>>
>> I build up a RPC server which using hector client to manipulate th= e
>> cassandra. Wired things happen nowadays: the latency of RPC someti= mes
>> becames very high (10seconds~70seconds) in several minutes and red= uce
>> to normal level (30ms in average) after that time. I investigate t= he
>> debug log of cassandra. During high latency time, the cassandra ou= tput
>> lots of message like:
>> "IncomingTcpConnection.java(116) Version is now 3. "
>> Seems everything be blocked during that time.
>>
>> Our settings as following:
>> The version of cassandra is 1.0.1 and hector version is 0.7.0 for = the
>> compatible of thrift version which we use (0.5.0)
>> The cluster contains 4 nodes and all of them are seeds. The
>> gcgraceseconds is 0 since we needn't delete the data
>>
>> p.s. It works well for a long time (3 months) but becames crazy th= ese
>> days after we push the new RPC server which supports bigger data >> saving (2mb in average). I'm not sure if these is the reason.<= br> >>
>> Hope getting your replay~~
>>
>> Thanks,
>>
>> Chao.
>
>

--14dae9340a3763cae504be6d3675--