hadoop-yarn-issues mailing list archives

From "Shinichi Yamashita (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1578) Fix how to handle ApplicationHistory about the container
Date Thu, 30 Jan 2014 06:45:05 GMT

     [ https://issues.apache.org/jira/browse/YARN-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichi Yamashita updated YARN-1578:
-------------------------------------

    Attachment: screenshot2.pdf

I see that the code you showed is the FileSystemApplicationHistoryStore.getContainer(ContainerId) method.
That code is fine, and the ApplicationMaster information can be viewed in the web UI.

However, when I open the list of containers from the AppAttempt link, the web UI displays "500" (see the attached screenshot2.pdf).
I apologize that my earlier explanation was unclear.

This access makes AHS call FileSystemApplicationHistoryStore.getContainers(ApplicationAttemptId),
and the following code does not set ContainerFinishData.
{code}
    HistoryFileReader hfReader =
        getHistoryFileReader(appAttemptId.getApplicationId());
    try {
      while (hfReader.hasNext()) {
        HistoryFileReader.Entry entry = hfReader.next();
        if (entry.key.id.startsWith(ConverterUtils.CONTAINER_PREFIX)) {
          if (entry.key.suffix.equals(START_DATA_SUFFIX)) {
            retrieveStartFinishData(appAttemptId, entry, startFinshDataMap,
              true);
          } else if (entry.key.suffix.equals(FINISH_DATA_SUFFIX)) {
            retrieveStartFinishData(appAttemptId, entry, startFinshDataMap,
              false);
          }
        }
      }
      LOG.info("Completed reading history information of all conatiners"
          + " of application attempt " + appAttemptId);
    } catch (IOException e) {
      LOG.info("Error when reading history information of some containers"
          + " of application attempt " + appAttemptId);
    } finally {
      hfReader.close();
    }
{code}

Because finish data may be missing from the history file, I think we should fix how
FileSystemApplicationHistoryStore reads the history file.
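
As a rough illustration of the idea, here is a minimal sketch of a merge step that tolerates a missing finish record. It is not the attached YARN-1578.patch and does not use the real Hadoop classes: ContainerHistoryMergeSketch, StartData, FinishData, StartFinish and Merged are hypothetical, simplified stand-ins, and skipping entries that have no start data is my own assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins; these are NOT the Hadoop classes.
final class ContainerHistoryMergeSketch {

  static final class StartData {
    final String containerId;
    final long startTime;
    StartData(String containerId, long startTime) {
      this.containerId = containerId;
      this.startTime = startTime;
    }
  }

  static final class FinishData {
    final String containerId;
    final long finishTime;
    final int exitStatus;
    FinishData(String containerId, long finishTime, int exitStatus) {
      this.containerId = containerId;
      this.finishTime = finishTime;
      this.exitStatus = exitStatus;
    }
  }

  // Pair collected while scanning the history file; either side may stay null.
  static final class StartFinish {
    StartData start;
    FinishData finish;
  }

  static final class Merged {
    final String containerId;
    final long startTime;
    final Long finishTime;    // null when no finish record was found
    final Integer exitStatus; // null when no finish record was found
    Merged(String containerId, long startTime, Long finishTime, Integer exitStatus) {
      this.containerId = containerId;
      this.startTime = startTime;
      this.finishTime = finishTime;
      this.exitStatus = exitStatus;
    }
  }

  // Merge start/finish pairs without dereferencing a missing finish record.
  static List<Merged> merge(Map<String, StartFinish> byContainerId) {
    List<Merged> result = new ArrayList<>();
    for (Map.Entry<String, StartFinish> e : byContainerId.entrySet()) {
      StartFinish sf = e.getValue();
      if (sf == null || sf.start == null) {
        continue; // assumption: without start data there is nothing to show
      }
      Long finishTime = null;
      Integer exitStatus = null;
      if (sf.finish != null) {
        finishTime = sf.finish.finishTime;
        exitStatus = sf.finish.exitStatus;
      }
      result.add(new Merged(e.getKey(), sf.start.startTime, finishTime, exitStatus));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, StartFinish> byContainerId = new LinkedHashMap<>();

    StartFinish finished = new StartFinish();
    finished.start = new StartData("container_A", 100L);
    finished.finish = new FinishData("container_A", 200L, 0);
    byContainerId.put("container_A", finished);

    StartFinish unfinished = new StartFinish(); // start record only
    unfinished.start = new StartData("container_B", 150L);
    byContainerId.put("container_B", unfinished);

    for (Merged m : merge(byContainerId)) {
      System.out.println(m.containerId + " finished=" + (m.finishTime != null));
    }
  }
}
```

With this shape, a container that was never finished is still listed, only with empty finish fields, so the page that renders the list would not need the finish record to exist.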

> Fix how to handle ApplicationHistory about the container
> --------------------------------------------------------
>
>                 Key: YARN-1578
>                 URL: https://issues.apache.org/jira/browse/YARN-1578
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: YARN-321
>            Reporter: Shinichi Yamashita
>            Assignee: Shinichi Yamashita
>         Attachments: YARN-1578.patch, application_1390978867235_0001, resoucemanager.log, screenshot.png, screenshot2.pdf
>
>
> I ran a PiEstimator job on a Hadoop cluster with YARN-321 applied.
> After the job finished, I accessed the web UI of the HistoryServer and it displayed "500". The HistoryServer daemon log contained the following output.
> {code}
> 2014-01-09 13:31:12,227 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory/appattempt/appattempt_1389146249925_0008_000001
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> (snip...)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.mergeContainerHistoryData(FileSystemApplicationHistoryStore.java:696)
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getContainers(FileSystemApplicationHistoryStore.java:429)
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainers(ApplicationHistoryManagerImpl.java:201)
>         at org.apache.hadoop.yarn.server.webapp.AppAttemptBlock.render(AppAttemptBlock.java:110)
> (snip...)
> {code}
> From the ApplicationHistory file, I confirmed that there was a container that had not finished.
> According to the ResourceManager daemon log, the ResourceManager reserved this container but did not allocate it.
> Therefore, ApplicationHistory needs to change how it handles a container that is never allocated.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
