incubator-mesos-dev mailing list archives

From "Charles Reiss (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-17) Hadoop executors killed while tasks in COMMIT_PENDING
Date Mon, 06 Jun 2011 21:20:59 GMT

     [ https://issues.apache.org/jira/browse/MESOS-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charles Reiss updated MESOS-17:
-------------------------------

    Description: The Hadoop framework considers tasks finished when they are in the COMMIT_PENDING
state. When using the LXC isolation module, this can cause the Hadoop executor's memory allocation
to be reduced before the task actually commits. When this happens, the Hadoop executor is
sometimes killed for exceeding its memory allocation, leaving the tasks stalled until the
master detects the lost task tracker by timeout.  (was: When using the LXC isolation module,
the Hadoop framework considers tasks finished when they are in the COMMIT_PENDING state. This
can cause the Hadoop executor's memory allocation to be reduced before the task actually commits.
When this happens, the Hadoop executor is sometimes killed for exceeding its memory allocation,
leaving the tasks stalled until the master detects the lost task tracker by timeout.)

> Hadoop executors killed while tasks in COMMIT_PENDING
> -----------------------------------------------------
>
>                 Key: MESOS-17
>                 URL: https://issues.apache.org/jira/browse/MESOS-17
>             Project: Mesos
>          Issue Type: Bug
>          Components: isolation
>         Environment: LXC isolation module, Hadoop framework
>            Reporter: Charles Reiss
>            Priority: Minor
>              Labels: hadoop, lxc
>
> The Hadoop framework considers tasks finished when they are in the COMMIT_PENDING state.
> When using the LXC isolation module, this can cause the Hadoop executor's memory allocation
> to be reduced before the task actually commits. When this happens, the Hadoop executor is
> sometimes killed for exceeding its memory allocation, leaving the tasks stalled until the
> master detects the lost task tracker by timeout.
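
To make the failure mode above concrete, here is a toy model of the accounting. It is not Mesos or Hadoop code: the names (TaskState, MEM_PER_TASK_MB, executor_allocation_mb, would_be_killed) and the one-share-per-task memory model are assumptions made purely for illustration. It only shows why counting COMMIT_PENDING as finished can leave the executor using more memory than it is allotted.

```python
from enum import Enum


class TaskState(Enum):
    RUNNING = "RUNNING"
    COMMIT_PENDING = "COMMIT_PENDING"
    SUCCEEDED = "SUCCEEDED"


# Hypothetical fixed memory share per task, in MB (illustrative only).
MEM_PER_TASK_MB = 512


def executor_allocation_mb(states, finished_states):
    """Memory allotted to the executor: one share per task not counted as finished."""
    active = [s for s in states if s not in finished_states]
    return MEM_PER_TASK_MB * len(active)


def would_be_killed(states, finished_states):
    """True if modeled usage exceeds the allocation, i.e. the executor would be killed."""
    # Assumption: a task keeps holding its share until it has actually committed.
    in_use = MEM_PER_TASK_MB * sum(1 for s in states if s is not TaskState.SUCCEEDED)
    return in_use > executor_allocation_mb(states, finished_states)


states = [TaskState.RUNNING, TaskState.COMMIT_PENDING]

# Behavior described in this issue: COMMIT_PENDING is treated as finished.
reported = {TaskState.COMMIT_PENDING, TaskState.SUCCEEDED}
# One possible alternative (an assumption, not a confirmed fix): only SUCCEEDED counts.
strict = {TaskState.SUCCEEDED}

print(would_be_killed(states, reported))
print(would_be_killed(states, strict))
```

In this model the allocation shrinks as soon as a task enters COMMIT_PENDING, while the task's memory is still in use, so the executor exceeds its limit until the commit completes.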

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
