asterixdb-notifications mailing list archives

From "Abdullah Alamoudi (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (ASTERIXDB-2081) Failed to restart after hit an OOM issue
Date Wed, 06 Sep 2017 16:18:00 GMT

     [ https://issues.apache.org/jira/browse/ASTERIXDB-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Alamoudi reassigned ASTERIXDB-2081:
--------------------------------------------

    Assignee: Murtadha Hubail  (was: Ian Maxon)

> Failed to restart after hit an OOM issue
> ----------------------------------------
>
>                 Key: ASTERIXDB-2081
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2081
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: STO - Storage
>         Environment: master
>            Reporter: Jianfeng Jia
>            Assignee: Murtadha Hubail
>
> One of the nodes failed due to an OOM error. When we then tried to restart the service,
> the node could not be recovered. The logs are shown below:
> {code}
> WARNING: Error in application message delivery!
> java.lang.IllegalStateException: Failed to redo
>     at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:712)
>     at org.apache.asterix.app.nc.RecoveryManager.startRecoveryRedoPhase(RecoveryManager.java:378)
>     at org.apache.asterix.app.nc.RecoveryManager.replayPartitionsLogs(RecoveryManager.java:187)
>     at org.apache.asterix.app.nc.RecoveryManager.startLocalRecovery(RecoveryManager.java:179)
>     at org.apache.asterix.app.nc.task.LocalRecoveryTask.perform(LocalRecoveryTask.java:43)
>     at org.apache.asterix.app.replication.message.StartupTaskResponseMessage.handle(StartupTaskResponseMessage.java:53)
>     at org.apache.asterix.messaging.NCMessageBroker.receivedMessage(NCMessageBroker.java:92)
>     at org.apache.hyracks.control.nc.work.ApplicationMessageWork.run(ApplicationMessageWork.java:54)
>     at org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127)
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Cannot allocate dataset 245 memory since memory budget would be exceeded.
>     at org.apache.asterix.common.context.DatasetLifecycleManager.allocateMemory(DatasetLifecycleManager.java:566)
>     at org.apache.hyracks.storage.common.buffercache.ResourceHeapBufferAllocator.reserveAllocation(ResourceHeapBufferAllocator.java:53)
>     at org.apache.hyracks.storage.am.lsm.common.impls.VirtualBufferCache.open(VirtualBufferCache.java:307)
>     at org.apache.hyracks.storage.am.lsm.common.impls.MultitenantVirtualBufferCache.open(MultitenantVirtualBufferCache.java:119)
>     at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.allocateMemoryComponent(LSMBTree.java:602)
>     at org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.allocateMemoryComponents(AbstractLSMIndex.java:386)
>     at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:417)
>     at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.forceModify(LSMHarness.java:364)
>     at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.forceUpsert(LSMTreeIndexAccessor.java:181)
>     at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:707)
>     ... 8 more
> Sep 05, 2017 3:37:46 PM org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> WARNING: Exception while executing ApplicationMessage: nodeID: 4
> java.lang.RuntimeException: java.lang.IllegalStateException: Failed to redo
>     at org.apache.hyracks.control.nc.work.ApplicationMessageWork.run(ApplicationMessageWork.java:60)
>     at org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127)
> Caused by: java.lang.IllegalStateException: Failed to redo
>     at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:712)
>     at org.apache.asterix.app.nc.RecoveryManager.startRecoveryRedoPhase(RecoveryManager.java:378)
>     at org.apache.asterix.app.nc.RecoveryManager.replayPartitionsLogs(RecoveryManager.java:187)
>     at org.apache.asterix.app.nc.RecoveryManager.startLocalRecovery(RecoveryManager.java:179)
>     at org.apache.asterix.app.nc.task.LocalRecoveryTask.perform(LocalRecoveryTask.java:43)
>     at org.apache.asterix.app.replication.message.StartupTaskResponseMessage.handle(StartupTaskResponseMessage.java:53)
>     at org.apache.asterix.messaging.NCMessageBroker.receivedMessage(NCMessageBroker.java:92)
>     at org.apache.hyracks.control.nc.work.ApplicationMessageWork.run(ApplicationMessageWork.java:54)
>     ... 1 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Cannot allocate dataset 245 memory since memory budget would be exceeded.
>     at org.apache.asterix.common.context.DatasetLifecycleManager.allocateMemory(DatasetLifecycleManager.java:566)
>     at org.apache.hyracks.storage.common.buffercache.ResourceHeapBufferAllocator.reserveAllocation(ResourceHeapBufferAllocator.java:53)
>     at org.apache.hyracks.storage.am.lsm.common.impls.VirtualBufferCache.open(VirtualBufferCache.java:307)
>     at org.apache.hyracks.storage.am.lsm.common.impls.MultitenantVirtualBufferCache.open(MultitenantVirtualBufferCache.java:119)
>     at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.allocateMemoryComponent(LSMBTree.java:602)
>     at org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.allocateMemoryComponents(AbstractLSMIndex.java:386)
>     at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:417)
>     at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.forceModify(LSMHarness.java:364)
>     at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.forceUpsert(LSMTreeIndexAccessor.java:181)
>     at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:707)
>     ... 8 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
