Date: Thu, 28 Apr 2011 20:48:28 -0700
Subject: Re: found one deadlock on hbase?
From: Stack
To: user@hbase.apache.org
Cc: Yanlijun, Chenjian

Yes. The below looks viable (though it is strange we have not seen it up to now). The profiler may have slowed things enough to bring on the deadlock (or the run up to the high water mark), but it's still a deadlock.

Please file a critical priority issue. If you have a patch, that'd be excellent.

Thanks for digging in on this,
St.Ack

2011/4/28 Zhoushuaifeng:
> Thanks, I will do more tests.
> Maybe the deadlock happened like this? Please point it out if it's wrong.
>
> 1. One handler is handling a put op and calls reclaimMemStoreMemory, but memory isAboveHighWaterMark, so this handler locks the MemStoreFlusher until global memory is lower:
>
>   public synchronized void reclaimMemStoreMemory() {
>     if (isAboveHighWaterMark()) {
>       lock.lock();
>       try {
>         while (isAboveHighWaterMark() && !server.isStopped()) {
>           wakeupFlushThread();
>           try {
>             // we should be able to wait forever, but we've seen a bug where
>             // we miss a notify, so put a 5 second bound on it at least.
>             flushOccurred.await(5, TimeUnit.SECONDS);
>           } catch (InterruptedException ie) {
>             Thread.currentThread().interrupt();
>           }
>         }
>       } finally {
>         lock.unlock();
>
> 2. flushforGlobalPressure is triggered, but to flush the memstore it needs to lock the MemStoreFlusher:
>
>   private boolean flushRegion(final HRegion region, final boolean emergencyFlush) {
>     synchronized (this.regionsInQueue) {
>       FlushRegionEntry fqe = this.regionsInQueue.remove(region);
>       if (fqe != null && emergencyFlush) {
>         // Need to remove from region from delay queue. When NOT an
>         // emergencyFlush, then item was removed via a flushQueue.poll.
>         flushQueue.remove(fqe);
>       }
>       lock.lock();
>     }
>
> 3. Because the lock is held by the IPC handler of the put op, flushRegion will never get the lock and the flush will never happen.
> 4. With no flush, memory stays in the AboveHighWaterMark state and the lock is never released, so a deadlock happens.
>
> Is it right?
>
> Zhou Shuaifeng(Frank)
>
>
> -----Original Message-----
> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
> Sent: April 29, 2011 3:09
> To: user@hbase.apache.org
> Subject: Re: found one deadlock on hbase?
>
> Like I said in the previous thread you made about this issue, it seems
> that the YourKit profiler is doing something unexpected from the HBase
> POV. Can you try running without it and see if it still happens?
>
> J-D
>
> 2011/4/28 Zhoushuaifeng:
>> Thanks, version is 0.90.1
>>
>> Zhou Shuaifeng(Frank)
>>
>> -----Original Message-----
>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
>> Sent: April 28, 2011 13:10
>> To: user@hbase.apache.org
>> Cc: Yanlijun
>> Subject: Re: found one deadlock on hbase?
>>
>> Must be a deadlock if the dumb JVM can figure it out. What version of
>> hbase please so I can dig into source code?
>> Thanks,
>> St.Ack
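
The sequence in steps 1 to 4 of the quoted analysis can be modelled in isolation. What follows is only a minimal sketch, not HBase code: the class FlushLockSketch, the method flusherGetsLock and the aboveHighWaterMark flag are hypothetical names invented for illustration. It assumes the handler keeps the lock held for the whole time it waits, which is modelled here with a plain sleep loop. Whether that holds in the real code depends on how flushOccurred is declared, which the quoted excerpt does not show: a Condition created from the same lock releases that lock while await is in progress. To keep the sketch from hanging, the "flusher" side uses a timed tryLock instead of lock().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone model of the scenario; none of these names exist in HBase.
final class FlushLockSketch {

    /**
     * Returns true if the "flusher" (the calling thread) obtained the lock
     * within waitMillis while the "handler" thread was still holding it.
     */
    public static boolean flusherGetsLock(long waitMillis) throws InterruptedException {
        final ReentrantLock lock = new ReentrantLock();
        // Assumption: stands in for isAboveHighWaterMark(); only a flush would clear it.
        final AtomicBoolean aboveHighWaterMark = new AtomicBoolean(true);
        final CountDownLatch handlerHoldsLock = new CountDownLatch(1);

        // "Handler": takes the lock, then waits for memory pressure to drop
        // without ever releasing the lock (the assumption stated above).
        Thread handler = new Thread(() -> {
            lock.lock();
            try {
                handlerHoldsLock.countDown();
                while (aboveHighWaterMark.get()) {
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            } finally {
                lock.unlock();
            }
        });
        handler.setDaemon(true);
        handler.start();
        handlerHoldsLock.await();

        // "Flusher": needs the same lock before it can flush and lower the mark.
        // A plain lock() here would block forever; the timeout lets us observe that.
        boolean acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            try {
                aboveHighWaterMark.set(false); // the flush would happen here
            } finally {
                lock.unlock();
            }
        }

        // Release the handler so the sketch terminates either way.
        aboveHighWaterMark.set(false);
        handler.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("flusher acquired lock: " + flusherGetsLock(200));
    }
}
```

Under the stated assumption the flusher times out, because the only thread able to clear the flag is the one blocked on the lock. If the handler waited on a Condition tied to that lock instead of sleeping, the flusher would be able to get in, so this sketch illustrates the reasoning in the thread rather than confirming the root cause.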