reef-dev mailing list archives

From "Dhruv Mahajan (JIRA)" <>
Subject [jira] [Commented] (REEF-1407) Catching exceptions in group communication are implemented incorrectly
Date Mon, 13 Jun 2016 22:14:04 GMT


Dhruv Mahajan commented on REEF-1407:

[~juliaw] The steps and the issue you described are correct. Regarding the proposed solutions:

We have not come up with an exact solution yet. The connection might still not be direct, but
go through a sequence of well-informed observers that have sufficient information to ping the
clients/observers upstream. One suggestion given by [~afchung90] was to use NameClient and
NameServer to register client names whenever a new connection is created.
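
To make the registration idea concrete, here is a minimal sketch in Python. It is only an
illustration under stated assumptions: NameRegistry, register, unregister and lookup are
hypothetical names and do not reflect the actual NameClient/NameServer API in REEF.

```python
# Hypothetical sketch: a name registry updated whenever a connection is created.
# NameRegistry and its methods are illustrative names, not REEF's NameServer API.
from collections import defaultdict


class NameRegistry:
    def __init__(self):
        # connection id -> set of client names registered on that connection
        self._clients = defaultdict(set)

    def register(self, connection_id, client_name):
        """Record that client_name uses connection_id (call on connection setup)."""
        self._clients[connection_id].add(client_name)

    def unregister(self, connection_id, client_name):
        """Forget a client, e.g. when its task completes."""
        self._clients[connection_id].discard(client_name)

    def lookup(self, connection_id):
        """Return the clients to ping if connection_id fails, in a stable order."""
        return sorted(self._clients.get(connection_id, ()))


registry = NameRegistry()
registry.register("conn-1", "task-A")
registry.register("conn-1", "task-B")
registry.register("conn-2", "task-C")
```

With such a registry, an observer that sees a failure on "conn-1" could look up which clients
to ping without holding a direct connection to each of them.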

I do not see an issue with multiple groups. If an error happens in a connection, nodes in
all the groups using that connection will be notified via the appropriate observers.
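
The multi-group case can be sketched as follows. This is a simplified Python model, not REEF
code: the Connection class, its subscribe/fail methods and the group names are assumptions
made up for illustration.

```python
# Hypothetical sketch: one connection shared by several groups. When the
# connection fails, every subscribed group's observer is notified.
# Connection, subscribe and fail are illustrative names, not REEF APIs.
class Connection:
    def __init__(self, name):
        self.name = name
        self._observers = []  # list of (group name, error callback) pairs

    def subscribe(self, group, on_error):
        """Register an error callback on behalf of a group using this connection."""
        self._observers.append((group, on_error))

    def fail(self, error):
        """Propagate a connection error to the observers of every group."""
        for group, on_error in self._observers:
            on_error(group, error)


received = []
conn = Connection("node1<->node2")
conn.subscribe("broadcast-group", lambda g, e: received.append((g, str(e))))
conn.subscribe("reduce-group", lambda g, e: received.append((g, str(e))))
conn.fail(ConnectionError("peer closed"))
```

After the call to fail, both groups have seen the same error, in subscription order.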

Regarding scalability, I am not sure what the concern is. The worst case is the flat topology,
where the master has to maintain P observers or so. I do not see that causing any issue either.
As said earlier, with large P, using the flat topology is itself a bad idea and something that
should be avoided.
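
As a rough illustration of the scalability point, the sketch below counts how many observers
each node would hold if every parent/child link needs one observer at each end. The counting
model and the function name observer_counts are assumptions for illustration only; P is taken
to be the number of non-master nodes.

```python
# Hypothetical sketch: observers per node for a master (node 0) plus p workers
# arranged in a tree with the given fan-out. A fan-out of p gives the flat
# topology, where every worker attaches directly to the master.
def observer_counts(p, fanout):
    total = p + 1                 # master plus p workers
    counts = [0] * total
    for child in range(1, total):
        parent = (child - 1) // fanout
        counts[parent] += 1       # the parent observes this child
        counts[child] += 1        # the child observes its parent
    return counts


flat = observer_counts(1000, fanout=1000)  # flat topology
tree = observer_counts(1000, fanout=2)     # binary tree topology
```

Under this model the flat master holds 1000 observers, while no node in the binary tree holds
more than 3 (one parent plus two children).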

> Catching exceptions in group communication are implemented incorrectly
> ----------------------------------------------------------------------
>                 Key: REEF-1407
>                 URL:
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Julia
>            Assignee: Dhruv Mahajan
>              Labels: FT
> Currently when a task fails, other tasks in the group are stuck in reading data by a
> blocking call. We should be able to try and throw an exception and propagate the exception
> to Task so that the task can handle it in a proper way.
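
The behaviour requested in the issue can be sketched with a blocking queue. This is a minimal
Python model rather than REEF's implementation: Inbox, GroupCommError and run_task are
hypothetical names, and the failure is injected by hand instead of being detected on a real
connection.

```python
# Hypothetical sketch: a blocked reader is woken up by a failure and the
# exception is propagated to the task, instead of the task hanging forever.
import queue
import threading


class GroupCommError(Exception):
    """Illustrative error type for a failed peer or connection."""


class Inbox:
    def __init__(self):
        self._q = queue.Queue()

    def deliver(self, data):
        self._q.put(data)

    def fail(self, error):
        # Enqueue the error so a reader blocked in read() is released.
        self._q.put(error)

    def read(self):
        item = self._q.get()      # blocks until data or an error arrives
        if isinstance(item, Exception):
            raise item            # surface the failure to the caller
        return item


def run_task(inbox, results):
    try:
        results.append(("data", inbox.read()))
        results.append(("data", inbox.read()))  # would block without fail()
    except GroupCommError as e:
        results.append(("error", str(e)))       # the task handles the failure


inbox = Inbox()
results = []
worker = threading.Thread(target=run_task, args=(inbox, results))
worker.start()
inbox.deliver(b"chunk-1")
inbox.fail(GroupCommError("peer task failed"))
worker.join(timeout=5)
```

Because the error travels through the same queue as the data, the second read returns
control to the task with an exception rather than leaving it stuck.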

This message was sent by Atlassian JIRA
