hadoop-yarn-issues mailing list archives

From "Gergo Repas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7585) NodeManager should go unhealthy when state store throws DBException
Date Tue, 02 Jan 2018 12:49:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307968#comment-16307968 ]

Gergo Repas commented on YARN-7585:
-----------------------------------

+1 (non-binding)

> NodeManager should go unhealthy when state store throws DBException 
> --------------------------------------------------------------------
>
>                 Key: YARN-7585
>                 URL: https://issues.apache.org/jira/browse/YARN-7585
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>         Attachments: YARN-7585.001.patch, YARN-7585.002.patch, YARN-7585.003.patch
>
>
> If work-preserving recovery is enabled, the NM will not start up if the state store does not initialise. However, if the state store becomes unavailable after that for any reason, the NM will not go unhealthy.
> Since the state store is not available, new containers cannot be started any more and the NM should become unhealthy:
> {code}
> AMLauncher: Error launching appattempt_1508806289867_268617_000001. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: org.iq80.leveldb.DBException: IO error: /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: Read-only file system
> at o.a.h.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
> at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:721)
> ...
> Caused by: java.io.IOException: org.iq80.leveldb.DBException: IO error: /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: Read-only file system
> at o.a.h.y.s.n.r.NMLeveldbStateStoreService.storeApplication(NMLeveldbStateStoreService.java:374)
> at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainerInternal(ContainerManagerImpl.java:848)
> at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:712)
> {code}
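
The behaviour requested in the description can be illustrated with a minimal sketch. This is not taken from the attached patches and does not use the real NodeManager classes: StateStore, HealthTracker and ContainerLauncher are hypothetical stand-ins, and a plain IOException stands in for the wrapped DBException. The idea is only that a failed state store write is recorded as a health failure, so the node reports unhealthy and refuses further launches instead of failing each one individually.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: every type here is hypothetical, not Hadoop API.
public class StateStoreHealthSketch {

    /** Stand-in for a recovery state store whose writes may fail. */
    public interface StateStore {
        void storeApplication(String appId) throws IOException;
    }

    /** Remembers the first state store failure and exposes it as node health. */
    public static final class HealthTracker {
        private final AtomicReference<String> failure = new AtomicReference<>();

        public void reportStoreFailure(Exception e) {
            // Keep only the first failure; later ones are usually consequences of it.
            failure.compareAndSet(null, String.valueOf(e.getMessage()));
        }

        public boolean isHealthy() {
            return failure.get() == null;
        }

        public String getReport() {
            String f = failure.get();
            return f == null ? "" : f;
        }
    }

    /** Stand-in for the container start path that writes to the state store. */
    public static final class ContainerLauncher {
        private final StateStore store;
        private final HealthTracker health;

        public ContainerLauncher(StateStore store, HealthTracker health) {
            this.store = store;
            this.health = health;
        }

        public void startContainer(String appId) throws IOException {
            // Once the node is unhealthy, reject new work up front.
            if (!health.isHealthy()) {
                throw new IOException("Node is unhealthy: " + health.getReport());
            }
            try {
                store.storeApplication(appId);
            } catch (IOException e) {
                // Mark the node unhealthy, then still surface the error to the caller.
                health.reportStoreFailure(e);
                throw e;
            }
        }
    }
}
```

With a store that always throws (for example because the file system was remounted read-only), the first startContainer call fails and flips isHealthy() to false. Later calls are rejected before they touch the store. In a real NodeManager the health state would also have to reach the ResourceManager through the node's regular health reporting, which this sketch leaves out.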



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

