mesos-issues mailing list archives

From "Erik Weathers (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-4737) document TaskID uniqueness requirement
Date Mon, 22 Feb 2016 21:46:18 GMT

     [ https://issues.apache.org/jira/browse/MESOS-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Weathers updated MESOS-4737:
---------------------------------
    Description: 
The comments above the definition of TaskID in [mesos.proto|https://github.com/apache/mesos/blob/0.27.0/include/mesos/mesos.proto#L63-L66]
suggest that it is OK to reuse a TaskID value as long as you guarantee that at most one task
with that ID is running at any given time.

{code: title=existing comments for TaskID}
 * A framework generated ID to distinguish a task. The ID must remain
 * unique while the task is active. However, a framework can reuse an
 * ID _only_ if a previous task with the same ID has reached a
 * terminal state (e.g., TASK_FINISHED, TASK_LOST, TASK_KILLED, etc.).
{code}

However, there are a few scenarios where problems can arise.

# The checkpointing-and-recovery feature of the mesos-slave (agent) conflicts with tasks that
reuse an ID and are assigned to the same executor.
#* See [this email|https://mail-archives.apache.org/mod_mbox/mesos-user/201602.mbox/%3CCAO5KYW8%2BXMWc1dXtEo20BAsfGow028jwjL2ubMinP%2BK%2BvdOh8w%40mail.gmail.com%3E]
for more info, as well as the attachment on this issue.
# Network partitions and master failover can make a TaskID appear to be unique in the system
when in fact another task with that ID is still running and was merely partitioned away for
some time.
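
The partition scenario can be illustrated with a toy model. This is only a sketch and not Mesos code: the {{ToyMaster}} and {{ToyAgent}} classes and the {{live_copies}} helper are hypothetical names invented for illustration, and the model assumes the master declares a task lost as soon as it cannot reach the agent.

```python
# Toy model (not Mesos code): shows how a reused TaskID can end up
# referring to two live tasks once a network partition is involved.

class ToyAgent:
    """Hypothetical agent that keeps running its tasks while partitioned."""

    def __init__(self, name):
        self.name = name
        self.running = set()  # TaskIDs this agent is actually running


class ToyMaster:
    """Hypothetical master that only knows what it has been told."""

    def __init__(self):
        self.view = {}  # task_id -> state as seen by the master

    def launch(self, agent, task_id):
        agent.running.add(task_id)
        self.view[task_id] = "TASK_RUNNING"

    def mark_lost(self, task_id):
        # The master cannot reach the agent, so it reports a terminal state
        # even though the agent may still be running the task.
        self.view[task_id] = "TASK_LOST"


def live_copies(agents, task_id):
    """Return the names of all agents actually running task_id."""
    return [a.name for a in agents if task_id in a.running]


master = ToyMaster()
agent_a, agent_b = ToyAgent("agent-a"), ToyAgent("agent-b")

master.launch(agent_a, "task-1")  # original task
master.mark_lost("task-1")        # agent-a is partitioned away
master.launch(agent_b, "task-1")  # framework reuses the "terminal" ID

copies = live_copies([agent_a, agent_b], "task-1")
print(copies)  # ['agent-a', 'agent-b']
```

From the framework's point of view the ID was free to reuse, yet two agents are running a task with the same TaskID.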

In light of these issues, we should update the documentation to make it abundantly clear
that reusing TaskIDs is never OK. At a minimum, this means updating the aforementioned
comments in {{mesos.proto}}. Any framework development guides that discuss TaskID
creation should be updated as well.
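
One way a framework might avoid reuse is to embed a random UUID in every TaskID it generates. The following is a minimal sketch under that assumption, not an approach prescribed by Mesos; the {{new_task_id}} helper and its {{prefix}} parameter are hypothetical.

```python
import uuid


def new_task_id(prefix="task"):
    """Return a TaskID string that the framework never hands out twice.

    `prefix` is a hypothetical human-readable label; the uniqueness comes
    from the random (version 4) UUID appended to it.
    """
    return f"{prefix}-{uuid.uuid4()}"


# Generate a batch of IDs, e.g. one per task launch or relaunch.
ids = [new_task_id("web") for _ in range(1000)]
print(len(set(ids)) == len(ids))
```

With this scheme a relaunched task gets a fresh ID rather than inheriting the ID of the task it replaces, so the framework needs its own mapping if it wants to relate the two.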



> document TaskID uniqueness requirement
> --------------------------------------
>
>                 Key: MESOS-4737
>                 URL: https://issues.apache.org/jira/browse/MESOS-4737
>             Project: Mesos
>          Issue Type: Task
>          Components: documentation
>    Affects Versions: 0.27.0
>            Reporter: Erik Weathers
>            Assignee: Erik Weathers
>            Priority: Minor
>              Labels: documentation
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
