mesos-issues mailing list archives

From "Andreas Raster (JIRA)" <>
Subject [jira] [Created] (MESOS-1845) CommandInfo tasks may fail if scheduled after another task with the same id has finished.
Date Tue, 30 Sep 2014 12:09:34 GMT
Andreas Raster created MESOS-1845:

             Summary: CommandInfo tasks may fail if scheduled after another task with the same id has finished.
                 Key: MESOS-1845
             Project: Mesos
          Issue Type: Bug
            Reporter: Andreas Raster

I created a small test framework to experiment with scheduling tasks where one task relies
on the results of another, previously run task. The framework first schedules a task that
appends the string "foo" to a file, and after that task finishes it schedules a second task
that appends "bar" to the same file.

This worked well when using ExecutorInfo, but when I switched to using CommandInfo instead
(specifying commands like 'echo foo >> /share/foobar.txt' in set_value()), the second step,
appending "bar", failed most of the time. Very rarely, it worked.

I couldn't find any meaningful log messages indicating what exactly went wrong. The slave
log indicated that the task's status changed to TASK_FAILED and that the status update
was sent correctly. The stdout log in the sandbox indicated that the command 'exited
with status 0'.

I could work around the issue by specifying task ids that were always unique. Previously,
the follow-up task that appends "bar" to the file reused the id of the finished task that
had appended "foo".
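
A minimal sketch of that workaround, assuming a hypothetical helper class TaskIdGenerator (it is not part of Mesos and not taken from the foobar framework): it appends a counter to a fixed prefix so that no id is handed out twice within one scheduler process.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Mesos): hands out task ids that are
// never repeated within the lifetime of one scheduler process.
class TaskIdGenerator {
public:
  explicit TaskIdGenerator(const std::string& prefix)
    : prefix_(prefix), counter_(0) {
    // An empty prefix would still give unique ids, but unreadable ones.
    assert(!prefix.empty());
  }

  // Returns "<prefix>-<n>", where n increases by one on every call.
  std::string next() {
    return prefix_ + "-" + std::to_string(counter_++);
  }

private:
  std::string prefix_;
  std::uint64_t counter_;
};
```

In the scheduler, the returned string would presumably be stored in the task's id, for example via task.mutable_task_id()->set_value(ids.next()), so the "foo" task and the "bar" task never share an id.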

It seems to me that something might go wrong when very short-running tasks with the same id
are scheduled in quick succession.

Source code for my foobar framework:

Build with:
g++ -std=c++0x -g -Wall foobar_framework.cpp -I. -L/usr/local/lib -lmesos -o foobar-framework
