Date: Fri, 1 Feb 2013 05:43:14 +0000 (UTC)
From: "Anoop Sam John (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-7728) deadlock occurs between hlog roller and hlog syncer

    [ https://issues.apache.org/jira/browse/HBASE-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568504#comment-13568504 ]

Anoop Sam John commented on HBASE-7728:
---------------------------------------

Ted,
In the last patch, you removed the null check for the hlogFlush() retry? The check is now in only one place in sync(): at retry time. We need the null check on the first sync attempt as well.
(without the updateLock)

> deadlock occurs between hlog roller and hlog syncer
> ---------------------------------------------------
>
>                 Key: HBASE-7728
>                 URL: https://issues.apache.org/jira/browse/HBASE-7728
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 0.94.2
>         Environment: Linux 2.6.18-164.el5 x86_64 GNU/Linux
>            Reporter: Wang Qiang
>            Assignee: Ted Yu
>            Priority: Blocker
>             Fix For: 0.96.0, 0.94.5
>
>         Attachments: 7728-0.94.txt, 7728-suggest-0.96.txt, 7728-suggest.txt, 7728-v1.txt, 7728-v2.txt, 7728-v3.txt, 7728-v4.txt
>
>
> the hlog roller thread and hlog syncer thread may occur dead lock with the 'flushLock' and 'updateLock', and then cause all 'IPC Server handler' thread blocked on hlog append. the jstack info is as follow :
> "regionserver60020.logRoller":
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1305)
>         - waiting to lock <0x000000067bf88d58> (a java.lang.Object)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1283)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1456)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.cleanupCurrentWriter(HLog.java:876)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:657)
>         - locked <0x000000067d54ace0> (a java.lang.Object)
>         at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:94)
>         at java.lang.Thread.run(Thread.java:662)
> "regionserver60020.logSyncer":
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1314)
>         - waiting to lock <0x000000067d54ace0> (a java.lang.Object)
>         - locked <0x000000067bf88d58> (a java.lang.Object)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1283)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1456)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1235)
>         at java.lang.Thread.run(Thread.java:662)
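
For illustration only: the jstack output above shows a classic lock-order cycle, where each thread holds one monitor and waits for the other. The sketch below is not HBase code and is not the patch attached to this issue. It reuses the names 'updateLock' and 'flushLock' from the report as plain stand-in objects, and the class LockOrderSketch and its methods are hypothetical. It shows one common remedy, which is to make every path take the two monitors in the same order.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
    // Stand-ins for the two monitors named in the report; not HBase code.
    private final Object updateLock = new Object();
    private final Object flushLock = new Object();
    private long rolls;
    private long syncs;

    // Roller path: takes updateLock first, then flushLock.
    void roll() {
        synchronized (updateLock) {
            synchronized (flushLock) {
                rolls++;
            }
        }
    }

    // Syncer path: same order as roll(). Taking flushLock first here
    // would allow the kind of cycle seen in the jstack output.
    void sync() {
        synchronized (updateLock) {
            synchronized (flushLock) {
                syncs++;
            }
        }
    }

    // Runs both paths concurrently; returns true if both finish in time.
    static boolean runBoth(int iterations) throws InterruptedException {
        LockOrderSketch log = new LockOrderSketch();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { for (int i = 0; i < iterations; i++) log.roll(); });
        pool.submit(() -> { for (int i = 0; i < iterations; i++) log.sync(); });
        pool.shutdown();
        boolean finished = pool.awaitTermination(10, TimeUnit.SECONDS);
        return finished && log.rolls == iterations && log.syncs == iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + runBoth(100_000));
    }
}
```

A fixed order removes the cycle but lengthens the time updateLock is held. Narrowing the scope so that the two monitors are never held together is the other usual option, and which one suits HLog is for the patch discussion to decide.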