reef-dev mailing list archives

From "Julia Wang (QIUHE)" <Qiuhe.W...@microsoft.com.INVALID>
Subject Driver doesn't shut down when running 1000 nodes
Date Fri, 12 May 2017 02:33:36 GMT
When I run the IMRU example with 1000 nodes on a cluster using the latest master bits, the
driver fails to shut down. The detailed logs show that not all CompletedEvaluator events are
received (1 or 2 are missing, varying between test runs), even though all the CompletedTask
events are received and our code has called Dispose() on all the active contexts.

The same 1000-node test shuts down successfully on the REEF bits from last December.

Is anyone aware of any possibly related changes in the past few months? If there are no
clues, I might do a binary search to find out when the issue started :).

I have logged REEF-1797<https://issues.apache.org/jira/browse/REEF-1797> for it.

Thanks,
Julia
