hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2330) Jobs are not displaying in timeline server after RM restart
Date Tue, 22 Jul 2014 07:41:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069943#comment-14069943 ]

Zhijie Shen commented on YARN-2330:
-----------------------------------

It is very likely that the history file of the given application is still open for writing
(for example, after the RM restarts, it reopens the history file to append history information),
while on the other side the reader tries to scan the file that is still being written.

The following logic is broken because the writer is invoked on the RM, while the reader is invoked on
the timeline server. Hence, from the reader's point of view, outstandingWriters is always empty,
so it cannot be used to indicate whether a file is open for writing:
{code}
    // The history file is still under writing
    if (outstandingWriters.containsKey(appId)) {
      throw new IOException("History file for application " + appId
          + " is under writing");
    }
{code}
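
To make the point concrete, here is a minimal sketch of why a process-local map cannot carry this signal. The class HistoryStoreModel and its methods are hypothetical stand-ins, not the actual FileSystemApplicationHistoryStore code; two separate instances are used to model the store running in two separate JVMs (the RM and the timeline server).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model, not Hadoop code: each instance stands in for the
// history store living inside one JVM (RM or timeline server).
class HistoryStoreModel {

    // Process-local bookkeeping, analogous in spirit to outstandingWriters.
    private final Map<String, Object> outstandingWriters = new ConcurrentHashMap<>();

    // Called on the writer side when a history file is opened for appending.
    void openForWriting(String appId) {
        outstandingWriters.put(appId, new Object());
    }

    // The check from the snippet above: only consults this instance's own map.
    boolean isUnderWriting(String appId) {
        return outstandingWriters.containsKey(appId);
    }

    public static void main(String[] args) {
        HistoryStoreModel rmStore = new HistoryStoreModel();        // models the RM JVM
        HistoryStoreModel timelineStore = new HistoryStoreModel();  // models the timeline server JVM

        String appId = "application_1406002968974_0003";
        rmStore.openForWriting(appId);

        // The writer side knows about its own writer...
        System.out.println("RM sees writer: " + rmStore.isUnderWriting(appId));
        // ...but the reader side has its own, empty map, so the guard never fires.
        System.out.println("Timeline server sees writer: " + timelineStore.isUnderWriting(appId));
    }
}
```

In this model the guard on the reader side never triggers, so the reader goes on to open a file that the other process is still appending to. Any indication that a file is under writing would have to come from state both processes can observe, not from an in-memory map.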

> Jobs are not displaying in timeline server after RM restart
> -----------------------------------------------------------
>
>                 Key: YARN-2330
>                 URL: https://issues.apache.org/jira/browse/YARN-2330
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: 2.4.1
>         Environment: Nodemanagers 3 (3*8GB)
> Queues A = 70%
> Queues B = 30%
>            Reporter: Nishan Shetty
>
> Submit jobs to queue A.
> While the jobs are running, restart the RM.
> Observe that those jobs are not displayed in the timeline server.
> {code}
> 2014-07-22 10:11:32,084 ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: History information of application application_1406002968974_0003 is not included into the result due to the exception
> java.io.IOException: Cannot seek to negative offset
> 	at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1381)
> 	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63)
> 	at org.apache.hadoop.io.file.tfile.BCFile$Reader.<init>(BCFile.java:624)
> 	at org.apache.hadoop.io.file.tfile.TFile$Reader.<init>(TFile.java:804)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileReader.<init>(FileSystemApplicationHistoryStore.java:683)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getHistoryFileReader(FileSystemApplicationHistoryStore.java:661)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getApplication(FileSystemApplicationHistoryStore.java:146)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getAllApplications(FileSystemApplicationHistoryStore.java:199)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getAllApplications(ApplicationHistoryManagerImpl.java:103)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:75)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:66)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:76)
> 	at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> 	at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
> 	at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
> 	at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
> 	at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197)
> 	at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> 	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
> 	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
> 	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
> 	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
> 	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
> 	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
> 	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
> 	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
> 	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
> 	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
> 	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> 	at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> 	at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1192)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> 	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
