reef-dev mailing list archives

From Markus Weimer <mar...@weimo.de>
Subject Re: [Java] ClientConfiguration.ON_RUNTIME_ERROR handler not being run when Driver submission fails
Date Sun, 15 May 2016 16:15:49 GMT
On 2016-05-13 15:52, Tobin Baker wrote:
> Hi, I just had the YARN ResourceManager fail to launch my Driver
> application because of a file permissions error, but my handler registered
> with ClientConfiguration.ON_RUNTIME_ERROR was never called (nor was
> ClientConfiguration.ON_JOB_FAILED). I assumed that any YARN runtime errors
> launching the Driver should trigger this handler; was I mistaken?

We don't have a test case for when the actual submission fails, so this 
might be a bug in REEF.

The `ON_JOB_FAILED` handler is fed by the Driver over the network, so it 
can't fire here: the Driver never started. It would be the logical place 
to report this kind of failure, though. `ON_RUNTIME_ERROR` is meant for 
cases where the RM itself becomes unavailable or the like. It is fed by 
the YARN client, however, which makes it more accessible to us in the 
case you describe.

Speaking of which: Do you see the exception thrown at line 121 of 
org.apache.reef.runtime.yarn.client.YarnJobSubmissionHandler.onNext()? 
If so, that is the place where we can catch it and route it into one of 
the above event handlers.
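
To make the idea concrete, here is a rough sketch in plain Java of catching 
a submission failure and handing it to an error handler. It is not the 
actual REEF code: `SubmissionSketch`, its `Submitter` interface, and the 
`Consumer<Throwable>` handler are made-up stand-ins for the real 
submission handler and for whatever is bound to `ON_RUNTIME_ERROR`.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in; this is NOT the real YarnJobSubmissionHandler. */
final class SubmissionSketch {

  /** Assumed stand-in for the call that hands the job to the RM. */
  public interface Submitter {
    void submit(String jobId) throws Exception;
  }

  private final Submitter submitter;
  // Assumed stand-in for the handler the client registered for runtime errors.
  private final Consumer<Throwable> runtimeErrorHandler;

  SubmissionSketch(final Submitter submitter,
                   final Consumer<Throwable> runtimeErrorHandler) {
    this.submitter = submitter;
    this.runtimeErrorHandler = runtimeErrorHandler;
  }

  /**
   * Attempts the submission. On failure, the exception is routed to the
   * error handler rather than being lost.
   *
   * @return true if the submission succeeded, false if it failed.
   */
  boolean onNext(final String jobId) {
    try {
      submitter.submit(jobId);
      return true;
    } catch (final Exception e) {
      runtimeErrorHandler.accept(e);
      return false;
    }
  }
}
```

With something like this, a file permissions error during submission would 
reach the client's handler instead of only showing up in the logs.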

Markus

