reef-dev mailing list archives

From "Sergiy Matusevych (JIRA)" <>
Subject [jira] [Commented] (REEF-1796) All REEF YARN jobs end with FORCE_CLOSED status on the client side
Date Mon, 15 May 2017 21:16:04 GMT


Sergiy Matusevych commented on REEF-1796:

It looks like we have a Driver setup issue. I've just noticed the following message in the HelloREEF
Driver log on YARN:
2017-05-12 14:14:30,346 FINE reef.runtime.common.driver.client.RemoteClientJobStatusHandler.<init> main | Instantiated 'RemoteClientJobStatusHandler' without an actual connection to the client.
So it's not that we close the connection prematurely; most likely we never have one in the first place.
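
A minimal sketch of how a handler created without a client connection could produce this symptom. Everything here is hypothetical and for illustration only: `FinalStatusSketch`, `StatusHandler`, `awaitFinalStatus`, the 100 ms timeout, and the "COMPLETED" status string are assumptions, not REEF's actual classes or values. A `CompletableFuture` stands in for the driver-to-client link.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FinalStatusSketch {

    /** Hypothetical driver-side handler; NOT REEF's RemoteClientJobStatusHandler. */
    static final class StatusHandler {
        // null models "instantiated without an actual connection to the client"
        private final CompletableFuture<String> clientLink;

        StatusHandler(CompletableFuture<String> clientLink) {
            this.clientLink = clientLink;
        }

        void sendFinalStatus(String status) {
            if (clientLink != null) {
                clientLink.complete(status);
            }
            // With no link, the final status is silently dropped.
        }
    }

    /** Hypothetical client-side wait: reports FORCE_CLOSED if nothing arrives in time. */
    static String awaitFinalStatus(CompletableFuture<String> link, long timeoutMs) {
        try {
            return link.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "FORCE_CLOSED";
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Runs one simulated job; the job itself always finishes normally. */
    static String runJob(boolean driverHasConnection) {
        CompletableFuture<String> link = new CompletableFuture<>();
        StatusHandler handler = new StatusHandler(driverHasConnection ? link : null);
        handler.sendFinalStatus("COMPLETED");
        return awaitFinalStatus(link, 100);
    }

    public static void main(String[] args) {
        System.out.println(runJob(true));   // prints COMPLETED
        System.out.println(runJob(false));  // prints FORCE_CLOSED
    }
}
```

In both runs the simulated job finishes normally; only the delivery of the final status differs, which would match what we see in the logs if the real handler behaves similarly.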

> All REEF YARN jobs end with FORCE_CLOSED status on the client side
> ------------------------------------------------------------------
>                 Key: REEF-1796
>                 URL:
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Sergiy Matusevych
>            Priority: Critical
>              Labels: bug, yarn
>   Original Estimate: 336h
>  Remaining Estimate: 336h
> It looks like the connection between the REEF driver and the REEF client is being closed prematurely
> either on the client or on the driver side when using the YARN runtime. As a result, the REEF driver
> fails to communicate its final status to the client, and the client times out with {{FORCE_CLOSED}}
> status. That causes *all* unit tests to fail on YARN. The logs show that REEF
> jobs complete normally on YARN; it's just the final status message that does not make it to
> the client.
> To reproduce: run e.g. HelloREEF application on YARN,
> {code}
> .\bin\runreef.ps1 -VerboseLog -Jars .\lang\java\reef-examples\target\reef-examples-0.16.0-SNAPSHOT-shaded.jar -Class org.apache.reef.examples.hello.HelloREEFYarn
> {code}
> or run the unit tests
> {code}
> .\bin\runtests.ps1 -Yarn -Jars ".\lang\java\reef-examples\target\reef-examples-0.16.0-SNAPSHOT-shaded.jar;.\lang\java\reef-tests\target\reef-tests-0.16.0-SNAPSHOT-test-jar-with-dependencies.jar"
> {code}

This message was sent by Atlassian JIRA
