reef-dev mailing list archives

From "Dhruv Mahajan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1407) Catching exceptions in group communication in failure case
Date Tue, 31 May 2016 21:41:12 GMT

    [ https://issues.apache.org/jira/browse/REEF-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308684#comment-15308684 ]

Dhruv Mahajan commented on REEF-1407:
-------------------------------------

I am responding to this statement: {{If we cannot get data after timeout, throw exception
and propagate the exception to IMRU task. In IMRU task, if the task doesn't receive Close
event, it can retry to take the data again. }}.

If Broadcast times out after receiving 100 chunks while another 100 chunks are still outstanding,
where are the first 100 chunks? If you look at the code, these 100 chunks are stored locally inside
the Broadcast function call, so once we exit the call after the timeout they are lost. How do we
expect to recover them when IMRUTaskHost calls Broadcast again?
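
To make the concern concrete, here is a minimal Python sketch. It is not REEF code: the class
names, the queue-based chunk source, and the ChunkTimeout exception are all hypothetical and
exist only for illustration. It contrasts a receiver that keeps chunks in a local variable
(a retry after a timeout starts from nothing) with one that keeps them on the object (a retry
resumes where the previous call stopped).

```python
import queue


class ChunkTimeout(Exception):
    """Hypothetical exception: the next chunk did not arrive in time."""


class LocalBufferReceiver:
    """Keeps received chunks in a local variable, so a timeout discards them."""

    def __init__(self, source, total_chunks):
        self.source = source
        self.total_chunks = total_chunks

    def receive(self, timeout):
        chunks = []  # lives only for the duration of this call
        while len(chunks) < self.total_chunks:
            try:
                chunks.append(self.source.get(timeout=timeout))
            except queue.Empty:
                # Everything collected so far is dropped when we leave the call.
                raise ChunkTimeout(f"timed out after {len(chunks)} chunks") from None
        return chunks


class ResumableReceiver:
    """Keeps received chunks on the instance, so a retry can resume."""

    def __init__(self, source, total_chunks):
        self.source = source
        self.total_chunks = total_chunks
        self.chunks = []  # survives a timed-out call

    def receive(self, timeout):
        while len(self.chunks) < self.total_chunks:
            try:
                self.chunks.append(self.source.get(timeout=timeout))
            except queue.Empty:
                raise ChunkTimeout(f"timed out after {len(self.chunks)} chunks") from None
        done, self.chunks = self.chunks, []  # reset for the next message
        return done


def run_with_one_retry(receiver_cls):
    """Deliver half the chunks, let the first call time out, then retry once."""
    source = queue.Queue()
    receiver = receiver_cls(source, total_chunks=4)
    for i in range(2):
        source.put(i)
    try:
        receiver.receive(timeout=0.01)
    except ChunkTimeout:
        pass  # the task decides to call receive again
    for i in range(2, 4):
        source.put(i)
    try:
        return receiver.receive(timeout=0.01)
    except ChunkTimeout:
        return None  # the message could not be reassembled
```

With the local buffer, the retry sees only the second half of the chunks and times out again.
With the buffer kept on the receiver, the retry completes the message. Whether keeping such
state across calls is the right fix for Broadcast is a separate design question.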

> Catching exceptions in group communication in failure case
> ----------------------------------------------------------
>
>                 Key: REEF-1407
>                 URL: https://issues.apache.org/jira/browse/REEF-1407
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Julia
>              Labels: FT
>
> Currently when a task fails, other tasks in the group are stuck in reading data by a
> blocking call. We should be able to try and throw an exception and propagate the exception
> to Task so that the task can handle it in a proper way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
