hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1578) Fix how to read history file in FileSystemApplicationHistoryStore
Date Mon, 03 Feb 2014 19:19:06 GMT

    [ https://issues.apache.org/jira/browse/YARN-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889782#comment-13889782 ]

Zhijie Shen commented on YARN-1578:

[~sinchii], thanks for the patch, which I think has found the buggy code to fix. I have some
comments on the fix:

1. Please refer to getContainer(). We should still merge the partial information into the container
history data object when only the start data or only the finish data is available. The get APIs
should tolerate missing information.

2. getApplicationAttempts needs to be fixed as well.

3. Please add some test cases that simulate partially missing data and verify that the get APIs still work.
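
To illustrate the kind of tolerance I mean in point 1, here is a minimal sketch. It is not the code in FileSystemApplicationHistoryStore: ContainerStart, ContainerFinish, ContainerHistory and merge() are hypothetical, simplified stand-ins for the real record classes, and the container ID is made up. The idea is only that each half is copied when it is present and nothing is dereferenced when it is absent.

```java
/**
 * Sketch only: ContainerStart, ContainerFinish and ContainerHistory are
 * hypothetical, simplified stand-ins, not the real YARN record classes.
 */
public class TolerantMergeSketch {

    /** Record written when a container starts (hypothetical fields). */
    static final class ContainerStart {
        final long startTime;

        ContainerStart(long startTime) {
            this.startTime = startTime;
        }
    }

    /** Record written when a container finishes (hypothetical fields). */
    static final class ContainerFinish {
        final long finishTime;
        final int exitStatus;

        ContainerFinish(long finishTime, int exitStatus) {
            this.finishTime = finishTime;
            this.exitStatus = exitStatus;
        }
    }

    /** Merged view; a null field means that piece was never written. */
    static final class ContainerHistory {
        final String containerId;
        Long startTime;
        Long finishTime;
        Integer exitStatus;

        ContainerHistory(String containerId) {
            this.containerId = containerId;
        }
    }

    /**
     * Merges whichever halves exist. Returns null only when neither the
     * start record nor the finish record was found.
     */
    static ContainerHistory merge(String containerId,
                                  ContainerStart start,
                                  ContainerFinish finish) {
        if (start == null && finish == null) {
            return null; // nothing was recorded for this container
        }
        ContainerHistory history = new ContainerHistory(containerId);
        if (start != null) {
            history.startTime = start.startTime;
        }
        if (finish != null) {
            history.finishTime = finish.finishTime;
            history.exitStatus = finish.exitStatus;
        }
        return history;
    }

    public static void main(String[] args) {
        // A container that was started (or reserved) but never finished.
        ContainerHistory h = merge("container_example_01", new ContainerStart(100L), null);
        System.out.println("start=" + h.startTime + ", finish=" + h.finishTime);
    }
}
```

The test cases in point 3 could follow the same pattern: write only the start data, only the finish data, or neither, and check that each get API returns a partially filled object (or nothing) instead of throwing.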

> Fix how to read history file in FileSystemApplicationHistoryStore
> -----------------------------------------------------------------
>                 Key: YARN-1578
>                 URL: https://issues.apache.org/jira/browse/YARN-1578
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: YARN-321
>            Reporter: Shinichi Yamashita
>            Assignee: Shinichi Yamashita
>         Attachments: YARN-1578-2.patch, YARN-1578.patch, application_1390978867235_0001,
resoucemanager.log, screenshot.png, screenshot2.pdf
> I ran a PiEstimator job on a Hadoop cluster with YARN-321 applied.
> After the job ended, accessing the HistoryServer web UI returned a "500" error, and the
> HistoryServer daemon log showed the following.
> {code}
> 2014-01-09 13:31:12,227 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling
URI: /applicationhistory/appattempt/appattempt_1389146249925_0008_000001
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> (snip...)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.mergeContainerHistoryData(FileSystemApplicationHistoryStore.java:696)
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getContainers(FileSystemApplicationHistoryStore.java:429)
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainers(ApplicationHistoryManagerImpl.java:201)
>         at org.apache.hadoop.yarn.server.webapp.AppAttemptBlock.render(AppAttemptBlock.java:110)
> (snip...)
> {code}
> From the ApplicationHistory file, I confirmed that there was a container that had not finished.
> According to the ResourceManager daemon log, ResourceManager reserved this container but did not allocate it.
> The problem occurs when FileSystemApplicationHistoryStore reads container information that has
> no finish data in the history file.
> To handle the case where the finish data is missing, we should fix how
> FileSystemApplicationHistoryStore reads the history file.

This message was sent by Atlassian JIRA
