mesos-dev mailing list archives

From "Mesos ReviewBot" <...@mesos.apache.org>
Subject Re: Review Request 25035: Fix for MESOS-1688
Date Sat, 06 Sep 2014 23:19:47 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25035/#review52550
-----------------------------------------------------------


Patch looks great!

Reviews applied: [25035]

All tests passed.

- Mesos ReviewBot


On Sept. 6, 2014, 10:03 p.m., Martin Weindel wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25035/
> -----------------------------------------------------------
> 
> (Updated Sept. 6, 2014, 10:03 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-1688
>     https://issues.apache.org/jira/browse/MESOS-1688
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As already explained in JIRA MESOS-1688, some schedulers allocate memory only for the
> executor and not for tasks. In this case, only CPU resources are allocated for tasks.
> Such a scheduler is not offered any idle CPUs if nearly all of the slave's memory is in use.
> This can easily lead to a deadlock (in the application, not in Mesos).
> 
> Simple example:
> 1. Scheduler allocates all memory of a slave for an executor
> 2. Scheduler launches a task for this executor (allocating 1 CPU)
> 3. Task finishes: 1 CPU and 0 MB of memory are allocatable.
> 4. No offers are made, as no memory is left. The scheduler will wait for offers forever:
>    a deadlock in the application.
> 
> To fix this problem, offers must be made whenever CPU resources are allocatable, without
> considering allocatable memory.
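
The difference between the two behaviours described above can be sketched as follows. This is a minimal illustration only, not the code from the patch: the struct, the function names and the threshold values are all hypothetical, and the "either" variant is just one way to satisfy the requirement that free CPUs alone are enough to trigger an offer.

```cpp
// Hypothetical, simplified stand-in for the unallocated resources of one slave.
struct FreeResources {
  double cpus;
  double memMb;
};

// Illustrative thresholds; these values are assumptions, not taken from the patch.
constexpr double kMinCpus = 0.1;
constexpr double kMinMemMb = 32.0;

// Behaviour described as the problem: an offer needs both CPU and memory.
constexpr bool allocatableRequiringBoth(const FreeResources& r) {
  return r.cpus >= kMinCpus && r.memMb >= kMinMemMb;
}

// Relaxed check: free CPU (or free memory) alone is enough for an offer.
constexpr bool allocatableRequiringEither(const FreeResources& r) {
  return r.cpus >= kMinCpus || r.memMb >= kMinMemMb;
}
```

For the state in step 3 of the example (1 CPU free, 0 MB free), the first check rejects the slave, so the scheduler never sees an offer, while the second check accepts it.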
> 
> 
> Diffs
> -----
> 
>   src/common/resources.cpp edf36b1 
>   src/master/constants.hpp ce7995b 
>   src/master/constants.cpp faa1503 
>   src/master/hierarchical_allocator_process.hpp 34f8cd6 
>   src/master/master.cpp 18464ba 
>   src/tests/allocator_tests.cpp 774528a 
> 
> Diff: https://reviews.apache.org/r/25035/diff/
> 
> 
> Testing
> -------
> 
> Deployed patched Mesos 0.19.1 on a small cluster with 3 slaves and tested running multiple
> parallel Spark jobs in "fine-grained" mode to saturate allocatable memory. The jobs now run
> fine. With unpatched Mesos, this load always caused a deadlock in all Spark jobs within one
> minute.
> 
> 
> Thanks,
> 
> Martin Weindel
> 
>

