asterixdb-notifications mailing list archives

From "Young-Seok Kim (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ASTERIXDB-1450) Transaction log file not found on recovery intermittent hang on integration test
Date Tue, 17 May 2016 22:16:12 GMT

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287741#comment-15287741 ]

Young-Seok Kim commented on ASTERIXDB-1450:
-------------------------------------------

This issue could be caused by the checkpoint thread.
If the checkpoint thread deleted the log file that contains the first log record (the first
LSN) of the job being aborted, that file may no longer exist when the recovery manager tries
to abort the job.
If this is the cause of the issue, I think the fix should be as follows:
The checkpoint thread (or some other component) should provide the LSN of the first valid
log record in the remaining log files.
The abort should then use that LSN as the starting LSN for its log scan.
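
A minimal sketch of the idea above, not AsterixDB code: every name here (AbortScanStart,
LogFileIndex, addFile, deleteOldestFile, firstValidLsn, abortScanStart) is hypothetical.
It reads the proposal as "start the abort scan at the later of the job's first LSN and the
first valid LSN", and it assumes the checkpoint only deletes files whose records no longer
need to be undone.

```java
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical sketch; none of these names are AsterixDB APIs. */
final class AbortScanStart {

    /** Tracks the retained log files, keyed by the LSN of each file's first record. */
    static final class LogFileIndex {
        private final TreeMap<Long, String> filesByFirstLsn = new TreeMap<>();

        synchronized void addFile(long firstLsn, String fileName) {
            filesByFirstLsn.put(firstLsn, fileName);
        }

        /** What the checkpoint thread would call when it removes the oldest file. */
        synchronized void deleteOldestFile() {
            filesByFirstLsn.pollFirstEntry();
        }

        /** LSN of the first log record that still exists on disk, if any. */
        synchronized OptionalLong firstValidLsn() {
            return filesByFirstLsn.isEmpty()
                    ? OptionalLong.empty()
                    : OptionalLong.of(filesByFirstLsn.firstKey());
        }
    }

    /**
     * Chooses where the abort scan starts: never before the first valid LSN,
     * so the reader is never asked to open a file that was already deleted.
     */
    static long abortScanStart(long jobFirstLsn, LogFileIndex index) {
        OptionalLong firstValid = index.firstValidLsn();
        if (!firstValid.isPresent()) {
            throw new IllegalStateException("no log files retained");
        }
        return Math.max(jobFirstLsn, firstValid.getAsLong());
    }
}
```

With files starting at LSN 0 and LSN 1000, a job whose first LSN is 250 scans from 250;
once the oldest file is deleted, the same job scans from 1000 instead of failing to open
the missing file.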

> Transaction log file not found on recovery intermittent hang on integration test
> --------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1450
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1450
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Michael Blow
>            Assignee: Murtadha Hubail
>
> See https://asterix-jenkins.ics.uci.edu/job/asterix-coverage/99/artifact/asterixdb/asterix-installer/target/asterix-installer-0.8.9-SNAPSHOT-binary-assembly/clusters/local/working_dir/logs/asterix_nc2.log
> INFO: { lock : 1, instantLock : 0, tryLock : 13133, instantTryLock : 46159, unlock : 13134, releaseLocks : 2511 }
> Exception in thread "Thread-1" java.lang.Error: org.apache.asterix.common.exceptions.ACIDException: Could not complete rollback! System is in an inconsistent state
> 	at org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:61)
> 	at org.apache.hyracks.control.nc.Joblet.performCleanup(Joblet.java:317)
> 	at org.apache.hyracks.control.nc.Joblet.removeTask(Joblet.java:153)
> 	at org.apache.hyracks.control.nc.work.NotifyTaskFailureWork.run(NotifyTaskFailureWork.java:54)
> 	at org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:132)
> Caused by: org.apache.asterix.common.exceptions.ACIDException: Could not complete rollback! System is in an inconsistent state
> 	at org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:72)
> 	at org.apache.asterix.transaction.management.service.transaction.TransactionManager.completedTransaction(TransactionManager.java:130)
> 	at org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:58)
> 	... 4 more
> Caused by: java.lang.IllegalStateException
> 	at org.apache.asterix.transaction.management.service.logging.LogManager.getFileChannel(LogManager.java:449)
> 	at org.apache.asterix.transaction.management.service.logging.LogReader.getFileChannel(LogReader.java:276)
> 	at org.apache.asterix.transaction.management.service.logging.LogReader.initializeScan(LogReader.java:74)
> 	at org.apache.asterix.transaction.management.service.recovery.RecoveryManager.rollbackTransaction(RecoveryManager.java:717)
> 	at org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:64)
> 	... 6 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
