Date: Tue, 17 May 2016 22:52:12 +0000 (UTC)
From: "Murtadha Hubail (JIRA)"
To: notifications@asterixdb.incubator.apache.org
Reply-To: dev@asterixdb.incubator.apache.org
Subject: [jira] [Commented] (ASTERIXDB-1450) Transaction log file not found on recovery intermittent hang on integration test

[ https://issues.apache.org/jira/browse/ASTERIXDB-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287808#comment-15287808 ]

Murtadha Hubail commented on ASTERIXDB-1450:
--------------------------------------------

I already thought about this and started working on it, but I hit another issue. When a log file partition is closed because it cannot fit the current log record, the append LSN in the log manager jumps to the LSN at the beginning of the next log file partition, but the flush LSN (the last flushed log record on disk) is not updated properly. This causes a deadlock during a rollback operation, since the flush LSN will always be smaller than the LSN of the aborted job's last log record.
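
The LSN behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not AsterixDB's LogManager: the class LsnJumpSketch, the 100-byte partition size, the syncFlushOnRollover switch, and the rollback condition (flush LSN must reach the end of the job's last record) are all hypothetical and chosen for illustration.

```java
// Hypothetical toy model of the append/flush LSN mismatch; not AsterixDB code.
class LsnJumpSketch {
    // Assumed size of one log file partition, in bytes.
    static final long PARTITION_SIZE = 100;

    long appendLsn = 0; // offset at which the next record will be appended
    long flushLsn = 0;  // offset up to which the log is considered on disk
    final boolean syncFlushOnRollover;

    LsnJumpSketch(boolean syncFlushOnRollover) {
        this.syncFlushOnRollover = syncFlushOnRollover;
    }

    /** Appends and immediately flushes one record; returns the LSN just past it. */
    long appendAndFlush(long recordSize) {
        long remaining = PARTITION_SIZE - (appendLsn % PARTITION_SIZE);
        if (recordSize > remaining) {
            // The record does not fit: close the partition and jump the
            // append LSN to the beginning of the next partition.
            appendLsn += remaining;
            if (syncFlushOnRollover) {
                // Assumed fix: also move the flush LSN past the unused tail.
                // Safe here because every earlier record is already flushed.
                flushLsn = appendLsn;
            }
        }
        appendLsn += recordSize;
        // The flusher only counts bytes it actually wrote.
        flushLsn += recordSize;
        return appendLsn;
    }

    /** Assumed condition: rollback may start once the last record is on disk. */
    boolean canRollback(long lastRecordEndLsn) {
        return flushLsn >= lastRecordEndLsn;
    }

    public static void main(String[] args) {
        for (boolean sync : new boolean[] {false, true}) {
            LsnJumpSketch log = new LsnJumpSketch(sync);
            log.appendAndFlush(60);            // fits in the first partition
            long end = log.appendAndFlush(60); // forces a jump to the next partition
            System.out.println("syncFlushOnRollover=" + sync
                    + " appendLsn=" + log.appendLsn
                    + " flushLsn=" + log.flushLsn
                    + " canRollback=" + log.canRollback(end));
        }
    }
}
```

In this model, without the adjustment at rollover the flush LSN stays behind by the size of the skipped tail, so a rollback that waits on it would never be released; with the adjustment the two LSNs meet again.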
Yingyi faced another issue where an invalid LSN was provided to the log reader (ASTERIXDB-1425).

> Transaction log file not found on recovery intermittent hang on integration test
> --------------------------------------------------------------------------------
>
> Key: ASTERIXDB-1450
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1450
> Project: Apache AsterixDB
> Issue Type: Bug
> Reporter: Michael Blow
> Assignee: Murtadha Hubail
>
> See https://asterix-jenkins.ics.uci.edu/job/asterix-coverage/99/artifact/asterixdb/asterix-installer/target/asterix-installer-0.8.9-SNAPSHOT-binary-assembly/clusters/local/working_dir/logs/asterix_nc2.log
> INFO: { lock : 1, instantLock : 0, tryLock : 13133, instantTryLock : 46159, unlock : 13134, releaseLocks : 2511 }
> Exception in thread "Thread-1" java.lang.Error: org.apache.asterix.common.exceptions.ACIDException: Could not complete rollback! System is in an inconsistent state
>     at org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:61)
>     at org.apache.hyracks.control.nc.Joblet.performCleanup(Joblet.java:317)
>     at org.apache.hyracks.control.nc.Joblet.removeTask(Joblet.java:153)
>     at org.apache.hyracks.control.nc.work.NotifyTaskFailureWork.run(NotifyTaskFailureWork.java:54)
>     at org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:132)
> Caused by: org.apache.asterix.common.exceptions.ACIDException: Could not complete rollback! System is in an inconsistent state
>     at org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:72)
>     at org.apache.asterix.transaction.management.service.transaction.TransactionManager.completedTransaction(TransactionManager.java:130)
>     at org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:58)
>     ... 4 more
> Caused by: java.lang.IllegalStateException
>     at org.apache.asterix.transaction.management.service.logging.LogManager.getFileChannel(LogManager.java:449)
>     at org.apache.asterix.transaction.management.service.logging.LogReader.getFileChannel(LogReader.java:276)
>     at org.apache.asterix.transaction.management.service.logging.LogReader.initializeScan(LogReader.java:74)
>     at org.apache.asterix.transaction.management.service.recovery.RecoveryManager.rollbackTransaction(RecoveryManager.java:717)
>     at org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:64)
>     ... 6 more

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)