hadoop-yarn-issues mailing list archives

From "Oleksii Dymytrov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-5924) Resource Manager fails to load state with InvalidProtocolBufferException
Date Tue, 22 Nov 2016 14:55:58 GMT

     [ https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksii Dymytrov updated YARN-5924:
-----------------------------------
    Attachment: YARN-5924-branch-3.0.0-alpha1.001.patch

> Resource Manager fails to load state with InvalidProtocolBufferException
> ------------------------------------------------------------------------
>
>                 Key: YARN-5924
>                 URL: https://issues.apache.org/jira/browse/YARN-5924
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Oleksii Dymytrov
>            Assignee: Oleksii Dymytrov
>         Attachments: YARN-5924-branch-3.0.0-alpha1.001.patch
>
>
> An InvalidProtocolBufferException is thrown while recovering an application's state if the application's data under the FSRMStateRoot/RMAppRoot/application_1477986176766_0134/ directory in HDFS has an invalid format (or is corrupted):
> {noformat}
> com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
> 	at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
> 	at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> 	at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:143)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:188)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:193)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
> 	at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$ApplicationStateDataProto.parseFrom(YarnServerResourceManagerRecoveryProtos.java:1028)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore$RMAppStateFileProcessor.processChildNode(FileSystemRMStateStore.java:966)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.processDirectoriesOfFiles(FileSystemRMStateStore.java:317)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMAppState(FileSystemRMStateStore.java:281)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:232)
> {noformat}
> A possible solution is to catch "InvalidProtocolBufferException", log a warning, and remove the application's folder that contains the invalid data, so that the RM restart does not fail.
> Additionally, I've added a catch for other exceptions that can occur while recovering a specific application, so that the RM does not fail even if only one application's state can't be loaded.
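
The per-application handling described above can be illustrated with a minimal, self-contained sketch. This is not the attached patch: RecoverySketch, InvalidStateDataException, parseAppState and loadAll are hypothetical names, the "leading 0x0A byte" validity rule is invented for the example, and the stand-in exception only mirrors the fact that InvalidProtocolBufferException is an IOException. Where the description removes the application's folder, the sketch merely records the skipped application ID.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RecoverySketch {

    /** Hypothetical stand-in for com.google.protobuf.InvalidProtocolBufferException. */
    static class InvalidStateDataException extends IOException {
        InvalidStateDataException(String message) {
            super(message);
        }
    }

    /** Hypothetical parser: a payload is "valid" only if it starts with the byte 0x0A. */
    static String parseAppState(byte[] data) throws InvalidStateDataException {
        if (data.length == 0 || data[0] != 0x0A) {
            throw new InvalidStateDataException("malformed application state");
        }
        return new String(data, 1, data.length - 1, StandardCharsets.UTF_8);
    }

    /**
     * Recovers every application it can. Applications whose state cannot be
     * loaded are added to 'skipped' instead of aborting the whole recovery.
     */
    static Map<String, String> loadAll(Map<String, byte[]> stored, List<String> skipped) {
        Map<String, String> recovered = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : stored.entrySet()) {
            String appId = entry.getKey();
            try {
                recovered.put(appId, parseAppState(entry.getValue()));
            } catch (InvalidStateDataException e) {
                // Invalid data: warn and skip. The description removes the folder here.
                System.err.println("WARN: invalid state for " + appId + ": " + e.getMessage());
                skipped.add(appId);
            } catch (Exception e) {
                // Any other failure for this one application must not stop recovery.
                System.err.println("WARN: could not recover " + appId + ": " + e);
                skipped.add(appId);
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        Map<String, byte[]> stored = new LinkedHashMap<>();
        stored.put("app_good", new byte[] {0x0A, 'o', 'k'});
        stored.put("app_broken", new byte[] {0x7F});
        List<String> skipped = new ArrayList<>();
        Map<String, String> recovered = loadAll(stored, skipped);
        System.out.println("recovered=" + recovered.keySet() + " skipped=" + skipped);
    }
}
```

The try/catch sits inside the loop, so one unreadable entry affects only that application; a single try block around the whole loop would still abort recovery at the first bad entry.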



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

