reef-dev mailing list archives

From "Tae-Geon Um (JIRA)" <>
Subject [jira] [Commented] (REEF-1729) Fix test job timeouts in Travis CI
Date Fri, 17 Mar 2017 09:41:41 GMT


Tae-Geon Um commented on REEF-1729:


The test time increased after [PR#1174|] was merged.
Before PR#1174 was merged, a test job took about 25 minutes ([Build#2163|]).
After it was merged, the time went up to 48 minutes ([Build#2175|]).
Because the build machine is sometimes slow, a job that normally takes 48 minutes can exceed
the 50-minute job limit and time out.

I don't think the test job timeouts are related to a runaway thread. Instead, they seem to be
caused by the use of {{[awaitUninterruptibly()|]}},
which waits for a quiet period so that resources are completely released. We discussed this issue
previously in [REEF-1231|].

I think it would be good to remove the {{awaitUninterruptibly()}} calls that were added in PR#1174.
Even if we delete that code, {{.close()}} is still idempotent. If you are fine with
this change, I will create a PR for it :)
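
To illustrate the point about idempotency, here is a minimal sketch that uses only the JDK. It is not the REEF or Netty code: {{Transport}}, {{quietPeriodMillis}} and {{waitForQuietPeriod}} are hypothetical names, and the quiet period is simulated with a sleep. It shows that the guard flag alone makes {{close()}} idempotent, while blocking on the shutdown future only adds the quiet period to the caller's wall-clock time.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentCloseSketch {

  /** Hypothetical stand-in for a transport whose shutdown has a quiet period. */
  static final class Transport implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final AtomicInteger shutdownsStarted = new AtomicInteger();
    private final long quietPeriodMillis;      // assumed value, for illustration only
    private final boolean waitForQuietPeriod;  // true mimics blocking on the shutdown future

    Transport(long quietPeriodMillis, boolean waitForQuietPeriod) {
      this.quietPeriodMillis = quietPeriodMillis;
      this.waitForQuietPeriod = waitForQuietPeriod;
    }

    @Override
    public void close() {
      // Only the first caller passes this guard; this alone makes close() idempotent.
      if (!closed.compareAndSet(false, true)) {
        return;
      }
      shutdownsStarted.incrementAndGet();
      // Simulated asynchronous shutdown that completes after the quiet period.
      CompletableFuture<Void> shutdown =
          CompletableFuture.runAsync(() -> sleepQuietly(quietPeriodMillis));
      if (waitForQuietPeriod) {
        shutdown.join(); // the caller pays the full quiet period here
      }
    }

    int shutdownsStarted() {
      return shutdownsStarted.get();
    }
  }

  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Calls close() twice and returns the elapsed wall-clock time in milliseconds. */
  static long timeCloseMillis(Transport transport) {
    long start = System.nanoTime();
    transport.close();
    transport.close(); // second call is a no-op in both variants
    return (System.nanoTime() - start) / 1_000_000;
  }

  public static void main(String[] args) {
    Transport blocking = new Transport(200, true);
    Transport nonBlocking = new Transport(200, false);
    System.out.println("blocking close took ms: " + timeCloseMillis(blocking));
    System.out.println("non-blocking close took ms: " + timeCloseMillis(nonBlocking));
    System.out.println("shutdowns started: " + blocking.shutdownsStarted()
        + " and " + nonBlocking.shutdownsStarted());
  }
}
```

In a test suite that opens and closes many such objects, the blocking variant pays the quiet period on every close, which is consistent with the longer job times reported above.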

> Fix test job timeouts in Travis CI
> ----------------------------------
>                 Key: REEF-1729
>                 URL:
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Mariia Mykhailova
>            Assignee: Sergiy Matusevych
> Recent changes in the way we're closing threads in Java code during REEF driver shutdown
seem to have introduced a bug in this area. We observe transient test job timeouts in [Travis
CI|]: typically one test job takes 39-41 minutes,
the limit on job duration is 50 minutes, and we're seeing test jobs hitting the limit and
timing out. There is no test failure reported in such cases, so I suspect there is some runaway
unaccounted for thread or an entire test which fails to complete properly.

