Date: Tue, 30 Sep 2014 12:09:34 +0000 (UTC)
From: "Andreas Raster (JIRA)"
To: issues@mesos.apache.org
Reply-To: dev@mesos.apache.org
Subject: [jira] [Created] (MESOS-1845) CommandInfo tasks may fail if scheduled after another task with the same id has finished.

Andreas Raster created MESOS-1845:
-------------------------------------

             Summary: CommandInfo tasks may fail if scheduled after another task with the same id has finished.
                 Key: MESOS-1845
                 URL: https://issues.apache.org/jira/browse/MESOS-1845
             Project: Mesos
          Issue Type: Bug
            Reporter: Andreas Raster

I created a little test framework where I wanted to experiment with scheduling tasks where running one task relies on the results of another, previously run task. So in my test framework I would first schedule a task that would append the string "foo" to a file, and after that one finishes I would schedule a task that appends "bar" to the same file.
This worked well when using ExecutorInfo, but when I switched to CommandInfo instead (specifying commands like 'echo foo >> /share/foobar.txt' in set_value()), the second step, appending "bar", failed most of the time. Very rarely it succeeded.

I couldn't find any meaningful log messages indicating what exactly went wrong. The slave log indicated that the task's status changed to TASK_FAILED and that the status update was sent correctly. The stdout log in the sandbox indicated that the command 'exited with status 0'.

I could work around the issue by specifying task ids that were always unique. Previously, the follow-up task that appends "bar" reused the id of the already finished task that appended "foo". It seems to me that something may go wrong when very short-running tasks with the same id are scheduled in quick succession.

Source code for my foobar framework: http://paste.ubuntu.com/8459083

Build with:

    g++ -std=c++0x -g -Wall foobar_framework.cpp -I. -L/usr/local/lib -lmesos -o foobar-framework
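
The workaround mentioned above, giving every task an id that is never reused, could look roughly like the sketch below. The helper name makeUniqueTaskId and the "prefix-counter" format are assumptions made for illustration; they are not taken from the linked framework source or from Mesos itself.

```cpp
#include <atomic>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Mesos or of the original framework):
// returns a task id that is never handed out twice within one scheduler
// process, so a follow-up task cannot collide with a finished one.
inline std::string makeUniqueTaskId(const std::string& prefix)
{
  // Monotonically increasing counter; atomic so that concurrent callers
  // can never observe the same value.
  static std::atomic<std::uint64_t> counter(0);
  return prefix + "-" + std::to_string(counter.fetch_add(1));
}
```

In a framework, the returned string would presumably be passed to something like task.mutable_task_id()->set_value(...) when building each TaskInfo, instead of a fixed id. Note that the counter restarts at zero when the scheduler process restarts, so a long-lived framework would likely need an additional unique component such as a timestamp or UUID.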