reef-dev mailing list archives

From "Taegeon Um (JIRA)" <j...@apache.org>
Subject [jira] [Created] (REEF-1850) IMRU test fails on Yarn with 800+ nodes
Date Sun, 06 Aug 2017 13:05:00 GMT
Taegeon Um created REEF-1850:
--------------------------------

             Summary: IMRU test fails on Yarn with 800+ nodes
                 Key: REEF-1850
                 URL: https://issues.apache.org/jira/browse/REEF-1850
             Project: REEF
          Issue Type: Bug
          Components: IMRU
            Reporter: Taegeon Um


From [~juliaw]'s experiments, we've found that the IMRU test fails on YARN with 800+ nodes.


With 500 nodes, the test passes.
With 1000 nodes, the test fails: received 1000 completed tasks but only 998 completed evaluators.
The Driver doesn’t shut down until I kill it.
With 800 nodes, the test fails: received 800 completed tasks but only 799 completed evaluators.
The Driver doesn’t shut down until I kill it.

We need to investigate this scalability issue and find the root cause.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
