reef-dev mailing list archives

From "Markus Weimer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1388) Fix RunningTask to be sent for short-lived .NET tasks
Date Wed, 11 May 2016 23:01:13 GMT

    [ https://issues.apache.org/jira/browse/REEF-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280955#comment-15280955 ]

Markus Weimer commented on REEF-1388:
-------------------------------------

That doesn't sound right. We should send the heartbeat immediately after we call {{Task.call()}}.
I am not sure how to make that happen without elaborate locking. Can you check the Java code?

> Fix RunningTask to be sent for short-lived .NET tasks
> -----------------------------------------------------
>
>                 Key: REEF-1388
>                 URL: https://issues.apache.org/jira/browse/REEF-1388
>             Project: REEF
>          Issue Type: Bug
>          Components: REEF.NET
>            Reporter: Mariia Mykhailova
>            Assignee: Mariia Mykhailova
>              Labels: FT
>
> Currently our task start handling code works as follows:
> 1. Send INIT message to driver.
> 2. Start task.
> 3. Send status updates as a periodic heartbeat with a 4-second period; the first RUNNING
> status received by the Java code triggers the RunningTask event.
>
> If the task completes fast enough, the periodic heartbeat might not catch the task while it
> is executing, and thus the driver will never receive the RunningTask event. All our tests
> which rely on RunningTask have tasks which either sleep for 5+ seconds or wait until a
> RunningTask handler sends a message to the task, so they never uncover this issue. This
> seems to be a bad design. We need to fix this (and probably also reduce the amount of sleep
> in some tests, in the spirit of REEF-1203).
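
The race described in the issue can be illustrated with a small, deterministic sketch. This is not REEF code: {{HeartbeatSketch}}, {{StatusReporter}} and the {{heartbeatOnTransition}} flag are hypothetical names made up for illustration, and the 4-second timer is replaced by an explicit call so that the ordering is reproducible. It assumes the driver only raises RunningTask when it sees a RUNNING status, and shows one possible fix: sending a heartbeat whenever the task state changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the heartbeat race; none of these types exist in REEF.
class HeartbeatSketch {
    enum State { INIT, RUNNING, DONE }

    // Stand-in for the evaluator-side component that reports task status to the driver.
    static class StatusReporter {
        private final List<State> sentToDriver = new ArrayList<>();
        private final boolean heartbeatOnTransition;
        private State current = State.INIT;

        StatusReporter(boolean heartbeatOnTransition) {
            this.heartbeatOnTransition = heartbeatOnTransition;
        }

        // Records the new state; optionally reports it right away instead of
        // waiting for the next periodic heartbeat.
        synchronized void setState(State next) {
            current = next;
            if (heartbeatOnTransition) {
                sendHeartbeat();
            }
        }

        // Sends whatever state is current at the moment of the call.
        synchronized void sendHeartbeat() {
            sentToDriver.add(current);
        }

        synchronized List<State> sent() {
            return new ArrayList<>(sentToDriver);
        }
    }

    // Simulates a task that finishes before the first periodic heartbeat fires.
    static List<State> runShortTask(boolean heartbeatOnTransition) {
        StatusReporter reporter = new StatusReporter(heartbeatOnTransition);
        reporter.sendHeartbeat();          // 1. send INIT to the driver
        reporter.setState(State.RUNNING);  // 2. start the task
        reporter.setState(State.DONE);     //    task body completes quickly
        reporter.sendHeartbeat();          // 3. first periodic heartbeat, too late
        return reporter.sent();
    }

    public static void main(String[] args) {
        System.out.println(runShortTask(false)); // [INIT, DONE]
        System.out.println(runShortTask(true));  // [INIT, RUNNING, DONE, DONE]
    }
}
```

With periodic heartbeats only, the driver never observes RUNNING for the short task. Reporting on each transition makes RUNNING visible at the cost of extra messages, and the single lock on the reporter is the kind of synchronization the comment above is concerned about.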



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
