reef-dev mailing list archives

From Markus Weimer <mar...@weimo.de>
Subject Re: Driver doesn't shut down when running 1000 nodes
Date Sun, 14 May 2017 23:58:45 GMT
Thanks for checking! My *hunch* would be to start with the changes
done to make REEF a library. @Serigiy, can you narrow it down even
more? -- Markus

On Thu, May 11, 2017 at 7:33 PM, Julia Wang (QIUHE)
<Qiuhe.Wang@microsoft.com.invalid> wrote:
> When I run the IMRU example with 1000 nodes on a cluster with the latest
> master bits, the driver cannot be shut down. The detailed logs show that
> not all the CompletedEvaluator events are received (1 or 2 are missing in
> different tests), even though all the CompletedTask events are received
> and our code has called Dispose() for all the active contexts.
>
> The same test with 1000 nodes on the REEF bits from last December shuts down successfully.
>
> Is anyone aware of any possibly related changes in the past few months? If nobody
> has a clue, I might do a binary search to find out when the issue started :).
>
> I have logged REEF-1797 <https://issues.apache.org/jira/browse/REEF-1797> for it.
>
> Thanks,
> Julia
