Date: Wed, 11 Nov 2015 22:13:11 +0000 (UTC)
From: "Ian Maxon (JIRA)"
To: notifications@asterixdb.incubator.apache.org
Reply-To: dev@asterixdb.incubator.apache.org
Subject: [jira] [Updated] (ASTERIXDB-1170) Deadlock in shutdown with DatasetLifecycleManager

     [ https://issues.apache.org/jira/browse/ASTERIXDB-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Maxon updated ASTERIXDB-1170:
---------------------------------
    Attachment: trace.txt

> Deadlock in shutdown with DatasetLifecycleManager
> -------------------------------------------------
>
>                 Key: ASTERIXDB-1170
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1170
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Ian Maxon
>         Attachments: trace.txt
>
>
> During cancel of a test run, I observed this deadlock in the DatasetLifecycleManager. It looks like the checkpoint thread is holding the optracker but needs the monitor on the DatasetLifecycleManager, and the DatasetLifecycleManager needs the converse. This, in turn, prevents clean shutdown.
>
> "org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:2:0:0@5996" daemon prio=5 tid=0x74 nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>      blocks org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995
>      waiting for org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995 to release lock on <0x17dc> (a org.apache.asterix.common.context.DatasetLifecycleManager)
>       at org.apache.asterix.common.context.DatasetLifecycleManager.allocateDatasetMemory(DatasetLifecycleManager.java:639)
>       at org.apache.asterix.common.context.PrimaryIndexOperationTracker.beforeOperation(PrimaryIndexOperationTracker.java:64)
>       at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.enterComponents(LSMHarness.java:180)
>       at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.getAndEnterComponents(LSMHarness.java:115)
>       - locked <0x17dd> (a org.apache.asterix.common.context.PrimaryIndexOperationTracker)
>       at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:333)
>       at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:327)
>       at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.insert(LSMTreeIndexAccessor.java:50)
>       at org.apache.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable.nextFrame(AsterixLSMInsertDeleteOperatorNodePushable.java:102)
>       at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:342)
>       at org.apache.hyracks.control.nc.Task.run(Task.java:290)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
>
> "org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995" daemon prio=5 tid=0x77 nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>      blocks Thread-55@5983
>      blocks org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:2:0:0@5996
>      waiting for org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:2:0:0@5996 to release lock on <0x17dd> (a org.apache.asterix.common.context.PrimaryIndexOperationTracker)
>       at org.apache.asterix.common.context.DatasetLifecycleManager.open(DatasetLifecycleManager.java:205)
>       - locked <0x17dc> (a org.apache.asterix.common.context.DatasetLifecycleManager)
>       at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.open(IndexDataflowHelper.java:116)
>       at org.apache.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable.open(AsterixLSMInsertDeleteOperatorNodePushable.java:61)
>       at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:334)
>       at org.apache.hyracks.control.nc.Task.run(Task.java:290)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
>
> "Thread-55@5983" prio=5 tid=0x5a nid=NA waiting for monitor entry
>   java.lang.Thread.State: BLOCKED
>      waiting for org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995 to release lock on <0x17dc> (a org.apache.asterix.common.context.DatasetLifecycleManager)
>       at org.apache.asterix.common.context.DatasetLifecycleManager.flushAllDatasets(DatasetLifecycleManager.java:474)
>       at org.apache.asterix.transaction.management.service.recovery.RecoveryManager.checkpoint(RecoveryManager.java:406)
>       - locked <0x17f5> (a org.apache.asterix.transaction.management.service.recovery.RecoveryManager)
>       at org.apache.asterix.hyracks.bootstrap.NCApplicationEntryPoint.stop(NCApplicationEntryPoint.java:132)
>       at org.apache.hyracks.control.nc.NodeControllerService.stop(NodeControllerService.java:347)
>       - locked <0x17f7> (a org.apache.hyracks.control.nc.NodeControllerService)
>       at org.apache.hyracks.control.nc.NodeControllerService$JVMShutdownHook.run(NodeControllerService.java:588)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
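
One way to read the trace above is as a wait-for graph: each blocked thread points at the lock it wants, and each lock points at the thread that holds it. The sketch below is only an illustration of that reading, not AsterixDB code. The class `WaitForGraph` and its methods are hypothetical, the thread labels are shortened forms of the names in the trace, and the lock ids are copied from the `locked` and `waiting for` lines. As far as the trace shows, the two SuperActivity threads form the cycle on <0x17dc> and <0x17dd>, while the shutdown thread (Thread-55) appears to be queued behind that cycle rather than part of it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reading a thread dump as a wait-for graph.
final class WaitForGraph {
    // lock id -> thread that currently owns it
    private final Map<String, String> ownerOf = new LinkedHashMap<>();
    // thread -> lock id it is blocked on
    private final Map<String, String> waitingOn = new LinkedHashMap<>();

    public void holds(String thread, String lock) { ownerOf.put(lock, thread); }

    public void waits(String thread, String lock) { waitingOn.put(thread, lock); }

    /**
     * Follows thread -> wanted lock -> owner until a thread repeats or the
     * chain ends. Returns the threads on the cycle that was reached, or an
     * empty list if the chain ends without one.
     */
    public List<String> cycleFrom(String start) {
        List<String> path = new ArrayList<>();
        String current = start;
        while (current != null && !path.contains(current)) {
            path.add(current);
            String wanted = waitingOn.get(current);
            current = (wanted == null) ? null : ownerOf.get(wanted);
        }
        if (current == null) {
            return new ArrayList<>();
        }
        // Drop any lead-in threads that only wait on the cycle.
        return new ArrayList<>(path.subList(path.indexOf(current), path.size()));
    }

    /** Lock ownership and waits transcribed from the trace in this message. */
    public static WaitForGraph fromTrace() {
        WaitForGraph g = new WaitForGraph();
        g.holds("SuperActivity@5996", "0x17dd"); // PrimaryIndexOperationTracker
        g.waits("SuperActivity@5996", "0x17dc"); // DatasetLifecycleManager
        g.holds("SuperActivity@5995", "0x17dc");
        g.waits("SuperActivity@5995", "0x17dd");
        g.holds("Thread-55@5983", "0x17f5");     // RecoveryManager
        g.holds("Thread-55@5983", "0x17f7");     // NodeControllerService
        g.waits("Thread-55@5983", "0x17dc");
        return g;
    }

    public static void main(String[] args) {
        WaitForGraph g = fromTrace();
        // prints [SuperActivity@5996, SuperActivity@5995]
        System.out.println(g.cycleFrom("SuperActivity@5996"));
        // prints [SuperActivity@5995, SuperActivity@5996]
        System.out.println(g.cycleFrom("Thread-55@5983"));
    }
}
```

Starting from Thread-55 the walk lands on the same two-thread cycle but does not include Thread-55 itself, which would be consistent with the shutdown hook hanging as a side effect of the inversion between the two operator threads.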