Date: Thu, 10 Mar 2016 02:19:40 +0000 (UTC)
From: "Dhruv Mahajan (JIRA)"
To: dev@reef.apache.org
Subject: [jira] [Commented] (REEF-1244) Group Communication does not close down properly at the end if reej job

    [ https://issues.apache.org/jira/browse/REEF-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188518#comment-15188518 ]

Dhruv Mahajan commented on REEF-1244:
-------------------------------------

[~markus.weimer] I have a question about this. How can we differentiate between streams that are actually failing and streams that fail only because we are closing them at one end? When that happens, the other end will usually still be blocked in Read() or ReadAsync(). Do we want to differentiate at all?
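One possible way to make that distinction, sketched here in Python rather than C# and not taken from the REEF code base: the closing side announces its shutdown with a marker before closing, each side remembers whether it has disposed its own end, and the reader classifies every read accordingly. The `Link` class, the `GOODBYE` marker and the result labels are all hypothetical names, and the sketch assumes that messages are framed so the marker arrives as its own read.

```python
import socket
import threading

# Hypothetical shutdown marker; not part of REEF's wire format.
GOODBYE = b"\x00BYE"


class Link:
    """One end of a stream between two evaluators (illustrative only)."""

    def __init__(self, sock):
        self._sock = sock
        self._lock = threading.Lock()
        self._disposed = False

    def send(self, payload):
        self._sock.sendall(payload)

    def dispose(self):
        """Announce the shutdown, then close. Safe to call more than once."""
        with self._lock:
            if self._disposed:
                return
            self._disposed = True
        try:
            self._sock.sendall(GOODBYE)
        except OSError:
            pass  # the peer closed first, so there is nobody left to notify
        self._sock.close()

    def read(self):
        """Return (kind, data); kind is 'data', 'peer-closed',
        'local-closed' or 'failed'."""
        try:
            chunk = self._sock.recv(4096)
        except OSError:
            # An error after our own dispose() is expected, otherwise it is real.
            return ("local-closed" if self._disposed else "failed", b"")
        if chunk == GOODBYE:
            return ("peer-closed", b"")  # the peer shut down on purpose
        if chunk == b"":
            # End of stream without a marker: only benign if we closed it.
            return ("local-closed" if self._disposed else "failed", b"")
        return ("data", chunk)
```

With this shape, a reader blocked in `read()` while the peer calls `dispose()` sees `peer-closed` instead of an error, a stream that ends without the marker is reported as `failed`, and a second `dispose()` on the same end is a no-op. Whether a marker of this kind fits the existing group communication protocol is an open question.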
> Group Communication does not close down properly at the end if reej job
> -----------------------------------------------------------------------
>
>                 Key: REEF-1244
>                 URL: https://issues.apache.org/jira/browse/REEF-1244
>             Project: REEF
>          Issue Type: Bug
>          Components: GroupCommunications
>    Affects Versions: 0.13
>         Environment: C#
>            Reporter: Dhruv Mahajan
>            Assignee: Dhruv Mahajan
>             Fix For: 0.13
>
>
> Currently, when we want to shut down an evaluator, the Dispose() function of group communications is called. However, this leads to a race condition. For example, suppose evaluator e1 calls Dispose() and closes its stream with evaluator e2. If e2 is in the stream's ReadAsync() function at that point, e2 gets a failure, since e2's own Dispose() function has not been called yet. Moreover, the Dispose() function in e2 will later try to close the already closed stream again.
> Some of these scenarios are handled by catching exceptions and ignoring them, but others are not caught and throw errors that cause the driver and the REEF job to fail.
> The aim of this JIRA is to identify all these closing scenarios and handle them appropriately.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)