Date: Mon, 4 Apr 2016 17:15:25 +0000 (UTC)
From: "Dhruv Mahajan (JIRA)"
To: dev@reef.apache.org
Subject: [jira] [Commented] (REEF-1310) The Java Driver should ACK the Java Evaluator's DONE heartbeat

    [ https://issues.apache.org/jira/browse/REEF-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224541#comment-15224541 ]

Dhruv Mahajan commented on REEF-1310:
-------------------------------------

[~afchung90] I believe this will also solve REEF-1291?

> The Java Driver should ACK the Java Evaluator's DONE heartbeat
> --------------------------------------------------------------
>
>                 Key: REEF-1310
>                 URL: https://issues.apache.org/jira/browse/REEF-1310
>             Project: REEF
>          Issue Type: Bug
>          Components: REEF, REEF Driver, REEF-Common
>            Reporter: Andrew Chung
>
> The Driver should ACK the Evaluator's DONE heartbeat so that a race condition does not occur when the Evaluator ends.
> For example, the Evaluator heartbeats DONE back to the Driver while the RM notices that the Evaluator process has exited. In this case, the RM may report to the Driver that the Evaluator is DONE before the Evaluator's own DONE heartbeat reaches the Driver, causing the Driver to invoke the {{FailedEvaluatorHandler}} because of an unexpected DONE message from the RM.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
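
The race described in the issue, and the effect of the proposed ACK, can be illustrated with a minimal single-threaded sketch. This is not REEF code: `DoneAckSketch`, `Driver`, `Evaluator`, `Outcome`, and all method names are hypothetical stand-ins, and the sketch assumes that the Evaluator process exits (and so becomes visible to the RM as exited) only after it has received the Driver's ACK.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of the DONE/ACK handshake; not REEF's actual classes. */
public class DoneAckSketch {

    /** What the Driver concludes when the RM reports that an Evaluator process exited. */
    enum Outcome { COMPLETED, FAILED }

    /** Driver side: remembers which Evaluators have already heartbeated DONE. */
    static final class Driver {
        private final Set<String> doneReceived = new HashSet<>();

        /** Records the DONE heartbeat and returns true, standing in for the ACK. */
        boolean onDoneHeartbeat(String evaluatorId) {
            doneReceived.add(evaluatorId);
            return true;
        }

        /** An RM exit report without a prior DONE heartbeat is treated as a failure. */
        Outcome onResourceManagerExit(String evaluatorId) {
            return doneReceived.contains(evaluatorId) ? Outcome.COMPLETED : Outcome.FAILED;
        }
    }

    /** Evaluator side: the process exits only once the Driver has ACKed DONE. */
    static final class Evaluator {
        private final String id;
        private boolean exited = false;

        Evaluator(String id) {
            this.id = id;
        }

        void shutdown(Driver driver) {
            boolean acked = driver.onDoneHeartbeat(id);
            if (acked) {
                exited = true; // only from this point can the RM observe the exit
            }
        }

        boolean hasExited() {
            return exited;
        }
    }

    public static void main(String[] args) {
        // Without an ACK: the RM's exit report overtakes the DONE heartbeat.
        Driver racyDriver = new Driver();
        System.out.println(racyDriver.onResourceManagerExit("eval-1")); // prints FAILED

        // With an ACK: the exit cannot be reported before DONE has been recorded.
        Driver driver = new Driver();
        Evaluator evaluator = new Evaluator("eval-1");
        evaluator.shutdown(driver);
        if (evaluator.hasExited()) {
            System.out.println(driver.onResourceManagerExit("eval-1")); // prints COMPLETED
        }
    }
}
```

In the sketch the ordering is enforced by the Evaluator waiting for the ACK before exiting, so the Driver never sees the RM's report first. A real implementation would also need to handle lost heartbeats and timeouts, which are not modeled here.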