reef-dev mailing list archives

From "Sergiy Matusevych (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (REEF-1796) All REEF YARN jobs end with FORCE_CLOSED status on the client side
Date Mon, 15 May 2017 22:12:04 GMT

     [ https://issues.apache.org/jira/browse/REEF-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergiy Matusevych updated REEF-1796:
------------------------------------
    Comment: was deleted

(was: Looks like we have a Driver setup issue. I've just noticed the following message in
the HelloREEF Driver log on YARN:
{code}
2017-05-12 14:14:30,346 FINE reef.runtime.common.driver.client.RemoteClientJobStatusHandler.<init>
main | Instantiated 'RemoteClientJobStatusHandler' without an actual connection to the client.
{code}
So it's not that we close the connection prematurely - it is likely we don't have it at all
:))

> All REEF YARN jobs end with FORCE_CLOSED status on the client side
> ------------------------------------------------------------------
>
>                 Key: REEF-1796
>                 URL: https://issues.apache.org/jira/browse/REEF-1796
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Sergiy Matusevych
>            Priority: Critical
>              Labels: bug, yarn
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> It looks like the connection between the REEF driver and the REEF client is being closed
> prematurely, either on the client or on the driver side, when using the YARN runtime. As a
> result, the REEF driver fails to communicate its final status to the client, and the client
> times out with {{FORCE_CLOSED}} status. That causes *all* unit tests to fail on YARN. The
> logs show that REEF jobs complete normally on YARN; it's just the final status message that
> does not make it to the client.
> To reproduce: run, e.g., the HelloREEF application on YARN:
> {code}
> .\bin\runreef.ps1 -VerboseLog -Jars .\lang\java\reef-examples\target\reef-examples-0.16.0-SNAPSHOT-shaded.jar -Class org.apache.reef.examples.hello.HelloREEFYarn
> {code}
> or run the unit tests:
> {code}
> .\bin\runtests.ps1 -Yarn -Jars ".\lang\java\reef-examples\target\reef-examples-0.16.0-SNAPSHOT-shaded.jar;.\lang\java\reef-tests\target\reef-tests-0.16.0-SNAPSHOT-test-jar-with-dependencies.jar"
> {code}
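
The failure mode described in the issue (the job finishes, but the final status never reaches the client, so the client times out) can be illustrated with a minimal, self-contained Java sketch. This is not REEF code: the class FinalStatusSketch, the method awaitFinalStatus, and the status strings "COMPLETED" and "FAILED" are hypothetical names chosen for illustration; only FORCE_CLOSED comes from the report. It assumes the client simply waits a bounded time for a status message and falls back to FORCE_CLOSED when none arrives.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FinalStatusSketch {

    /**
     * Hypothetical client-side wait (not the REEF API): returns the status
     * delivered by the driver, or FORCE_CLOSED if nothing arrives in time.
     */
    static String awaitFinalStatus(CompletableFuture<String> statusFromDriver, long timeoutMillis) {
        try {
            return statusFromDriver.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // No final status message was received before the deadline.
            return "FORCE_CLOSED";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "FAILED"; // hypothetical status for an interrupted wait
        } catch (ExecutionException e) {
            return "FAILED"; // hypothetical status for a failed delivery
        }
    }

    public static void main(String[] args) {
        // Healthy case: the driver has a connection and delivers its final status.
        CompletableFuture<String> delivered = CompletableFuture.completedFuture("COMPLETED");
        System.out.println(awaitFinalStatus(delivered, 100));

        // Case from the report: the driver has no connection to the client,
        // so the future is never completed and the client times out.
        CompletableFuture<String> neverDelivered = new CompletableFuture<>();
        System.out.println(awaitFinalStatus(neverDelivered, 100));
    }
}
```

In this sketch the second call reports FORCE_CLOSED even though the job itself may have completed normally, which matches the symptom in the description: the outcome seen by the client depends only on whether the final status message is delivered.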



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
