reef-dev mailing list archives

From "Markus Weimer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1407) Catching exceptions in group communication in failure case
Date Mon, 06 Jun 2016 15:29:21 GMT

    [ https://issues.apache.org/jira/browse/REEF-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15316649#comment-15316649 ]

Markus Weimer commented on REEF-1407:
-------------------------------------

I'm very much in favor of capitalizing on Rx for this. We went out of our way to implement
parts of this .NET API in Java. It would be a shame not to use it in .NET. Hence, I think
we should Rx-ify the communications layer:

  * Remote network connections become {{Observer}}s or {{Observable}}s.
  * All operators become {{Observer}}s or {{Observable}}s.
 
Error propagation can then happen naturally via {{OnError()}}, and termination is signaled via
{{OnCompleted()}}. For example, an ending Task would call {{_reduceReceiver.OnCompleted()}} to
indicate that there won't be any more reduce input from this Task.
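
To make the termination contract concrete, here is a minimal sketch in Java (the language REEF ported this API to). Everything in it is an assumption for illustration: the local {{Observer}} interface only mirrors the three callbacks of .NET's {{IObserver<T>}}, and {{SumReceiver}} is a hypothetical per-Task reduce receiver, not an existing REEF class.

```java
import java.util.ArrayList;
import java.util.List;

public class ReduceSketch {

  /** Local stand-in mirroring the three callbacks of .NET's IObserver<T>. */
  interface Observer<T> {
    void onNext(T value);
    void onError(Exception error);
    void onCompleted();
  }

  /**
   * Hypothetical receiver for the reduce input of ONE upstream Task.
   * It sums the values it sees and records how the stream ended.
   */
  static final class SumReceiver implements Observer<Integer> {
    int sum = 0;
    final List<String> events = new ArrayList<>();
    private boolean done = false;

    @Override public void onNext(Integer value) {
      if (done) return;          // ignore input after a terminal event
      sum += value;
    }

    @Override public void onError(Exception error) {
      if (done) return;
      done = true;               // failure is a terminal event
      events.add("error: " + error.getMessage());
    }

    @Override public void onCompleted() {
      if (done) return;
      done = true;               // normal end: no more input from this Task
      events.add("completed");
    }
  }

  public static void main(String[] args) {
    // A Task that finishes normally signals completion.
    SumReceiver ok = new SumReceiver();
    ok.onNext(1);
    ok.onNext(2);
    ok.onCompleted();
    System.out.println(ok.sum + " " + ok.events);

    // A failing Task pushes the error downstream instead of leaving readers blocked.
    SumReceiver failed = new SumReceiver();
    failed.onNext(1);
    failed.onError(new RuntimeException("task failed"));
    System.out.println(failed.sum + " " + failed.events);
  }
}
```

The point of the sketch is that the receiver is told about both kinds of termination and never has to wait in a blocking read to find out that its peer is gone.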

Further, this allows us to use more of Rx in the code, especially the scheduling functionality
and the many operators (filter, group, map, ...) that can be applied to Rx streams. I hope
that this would greatly simplify the code in the long run.
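
As a rough illustration of what operator composition buys us, the following Java sketch hand-rolls {{filter}} and {{map}} as wrappers around a downstream observer. This is not the Rx API (Rx applies operators to Observables and ships them ready-made); the {{Observer}} interface, the two helpers, and {{Collector}} are all hypothetical names, chosen only to show that values and terminal events flow through a pipeline uniformly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class OperatorSketch {

  /** Local stand-in mirroring the three callbacks of .NET's IObserver<T>. */
  interface Observer<T> {
    void onNext(T value);
    void onError(Exception error);
    void onCompleted();
  }

  /** filter: forwards only matching values; terminal events pass through unchanged. */
  static <T> Observer<T> filter(Predicate<T> keep, Observer<T> downstream) {
    return new Observer<T>() {
      @Override public void onNext(T value) {
        if (keep.test(value)) downstream.onNext(value);
      }
      @Override public void onError(Exception error) { downstream.onError(error); }
      @Override public void onCompleted() { downstream.onCompleted(); }
    };
  }

  /** map: transforms each value before forwarding; terminal events pass through unchanged. */
  static <T, R> Observer<T> map(Function<T, R> f, Observer<R> downstream) {
    return new Observer<T>() {
      @Override public void onNext(T value) { downstream.onNext(f.apply(value)); }
      @Override public void onError(Exception error) { downstream.onError(error); }
      @Override public void onCompleted() { downstream.onCompleted(); }
    };
  }

  /** Hypothetical sink that records everything it receives. */
  static final class Collector implements Observer<String> {
    final List<String> log = new ArrayList<>();
    @Override public void onNext(String value) { log.add(value); }
    @Override public void onError(Exception error) { log.add("error: " + error.getMessage()); }
    @Override public void onCompleted() { log.add("completed"); }
  }

  public static void main(String[] args) {
    Collector sink = new Collector();
    // Build the pipeline back to front: label each value, then keep only even inputs.
    Observer<Integer> labeled = map((Integer x) -> "v" + x, sink);
    Observer<Integer> pipeline = filter((Integer x) -> x % 2 == 0, labeled);

    pipeline.onNext(1);
    pipeline.onNext(2);
    pipeline.onNext(4);
    pipeline.onCompleted();
    System.out.println(sink.log);
  }
}
```

Because every stage forwards {{onError}} and {{onCompleted}}, a failure anywhere upstream reaches the sink without any stage needing special-case code.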

That said, we shouldn't do all of this just to fix this bug. But it does mean that I am in
favor of solution (a) above, as it gets us closer to that more .NET-native design.

[~anupam128], can you please fact-check the above? I don't have extensive experience with
those parts of the .NET API :)

> Catching exceptions in group communication in failure case
> ----------------------------------------------------------
>
>                 Key: REEF-1407
>                 URL: https://issues.apache.org/jira/browse/REEF-1407
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Julia
>            Assignee: Dhruv Mahajan
>              Labels: FT
>
> Currently, when a task fails, the other tasks in the group are stuck reading data in a
> blocking call. We should be able to try and throw an exception and propagate the exception
> to the Task so that the task can handle it properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
