mesos-user mailing list archives

From Alexander Gallego
Subject Re: High latency when scheduling and executing many tiny tasks.
Date Fri, 17 Jul 2015 21:18:49 GMT
I use a similar pattern.

I have my own scheduler, as you do. I deploy my own executor, which
downloads a tar from some storage and effectively `execvp(...)`s a
proc. It monitors the child proc and reports its status when the child pid exits.
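
As a rough illustration of the spawn, monitor, and report part of this pattern (not the actual executor described here), the sketch below runs a command as a child process and maps its exit code to a task state. The function names and the `report` callback are hypothetical stand-ins for whatever sends status updates to the scheduler; only the state names mirror Mesos task states. The tar download step is left out.

```python
import subprocess
import sys

# State names mirror Mesos task states; everything else here is illustrative.
TASK_RUNNING = "TASK_RUNNING"
TASK_FINISHED = "TASK_FINISHED"
TASK_FAILED = "TASK_FAILED"


def exit_code_to_state(returncode):
    """Map a child's exit status to a terminal task state."""
    # Exit code 0 means success; non-zero codes and signal deaths
    # (reported by subprocess as negative values) count as failures.
    return TASK_FINISHED if returncode == 0 else TASK_FAILED


def run_and_report(argv, report):
    """Spawn argv as a child, wait for it to exit, and report each state.

    `report` is a hypothetical callback taking one state string.
    """
    try:
        child = subprocess.Popen(argv)
    except OSError:
        # The command could not be started at all.
        report(TASK_FAILED)
        return TASK_FAILED
    report(TASK_RUNNING)
    state = exit_code_to_state(child.wait())
    report(state)
    return state


if __name__ == "__main__":
    # Example: run a trivial child and print each status update.
    run_and_report([sys.executable, "-c", "print('hello world')"], print)
```

A long-lived executor that runs many such children avoids paying the executor startup cost for every tiny task, which is the main reason to prefer it over one command task per job.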

Check out the Marathon code if you are writing in Scala. It is an excellent
example of both scheduler and executor templates.


On Fri, Jul 17, 2015 at 5:06 PM, Philip Weaver wrote:

> Awesome, I suspected that was the case, but hadn't discovered the
> --allocation_interval flag, so I will use that.
> I installed from the mesosphere RPMs and didn't change any flags from
> there. I will try to find some logs that provide some insight into the
> execution times.
> I am using a command task. I haven't looked into executors yet; I had a
> hard time finding some examples in my language (Scala).
> On Fri, Jul 17, 2015 at 2:00 PM, Benjamin Mahler wrote:
>> One other thing, do you use an executor to run many tasks? Or are you
>> using a command task?
>> On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler wrote:
>>> Currently, recovered resources are not immediately re-offered as you
>>> noticed, and the default allocation interval is 1 second. I'd recommend
>>> lowering that (e.g. --allocation_interval=50ms), which should improve the
>>> second bullet you listed. Although, in your case it would be better to
>>> immediately re-offer recovered resources (feel free to file a ticket for
>>> supporting that).
>>> For the first bullet, mind providing some more information? E.g. master
>>> flags, slave flags, scheduler logs, master logs, slave logs, executor logs?
>>> We would need to trace through a task launch to see where the latency is
>>> being introduced.
>>> On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver wrote:
>>>> I'm trying to understand the behavior of mesos, and if what I am
>>>> observing is typical or if I'm doing something wrong, and what options I
>>>> have for improving the performance of how offers are made and how tasks are
>>>> executed for my particular use case.
>>>> I have written a Scheduler that has a queue of very small tasks (for
>>>> testing, they are "echo hello world", but in production many of them won't
>>>> be much more expensive than that). Each task is configured to use 1 cpu
>>>> resource. When resourceOffers is called, I launch as many tasks as I can
>>>> on the given offers; that is, one call to driver.launchTasks for each offer,
>>>> with a list of tasks that has one task for each cpu in that offer.
>>>> On a cluster of 3 nodes and 4 cores each (12 total cores), it takes
>>>> 120s to execute 1000 tasks out of the queue. We are evaluating mesos because
>>>> we want to use it to replace our current homegrown cluster controller,
>>>> which can execute 1000 tasks in way less than 120s.
>>>> I am seeing two things that concern me:
>>>>    - The time between driver.launchTasks and receiving a callback to
>>>>    statusUpdate when the task completes is typically 200-500ms, and sometimes
>>>>    even as high as 1000-2000ms.
>>>>    - The time between when a task completes and when I get an offer
>>>>    for the newly freed resource is another 500ms or so.
>>>> These latencies explain why I can only execute tasks at a rate of about
>>>> 8/s.
>>>> It looks like my offers always include all 4 cores on each machine,
>>>> which would indicate that mesos doesn't like to send an offer as soon as a
>>>> single resource is available, and prefers to delay and send an offer with
>>>> more resources in it. Is this true?
>>>> Thanks in advance for any advice you can offer!
>>>> - Philip
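
The figures in the original message can be checked with a back-of-the-envelope calculation. The sketch below assumes a simple steady-state model in which each of the 12 cores repeatedly goes through one launch-to-reoffer cycle; the function names are illustrative, and the model ignores batching of offers per machine.

```python
def observed_rate(total_tasks, wall_seconds):
    """Tasks completed per second over the whole run."""
    return total_tasks / wall_seconds


def implied_cycle_seconds(total_tasks, wall_seconds, slots):
    """Average time one slot (core) spends per task, given the observed run."""
    return slots * wall_seconds / total_tasks


def tasks_per_second(slots, cycle_seconds):
    """Throughput if every slot completes one task per cycle."""
    return slots / cycle_seconds


if __name__ == "__main__":
    # Numbers taken from the thread: 1000 tasks, 120 s, 3 nodes x 4 cores.
    print(observed_rate(1000, 120))              # about 8.3 tasks/s
    print(implied_cycle_seconds(1000, 120, 12))  # about 1.44 s per task per core
    # Reported latencies: 0.2-0.5 s launch-to-status plus about 0.5 s to re-offer.
    print(tasks_per_second(12, 0.5 + 0.5))       # 12 tasks/s at the slow end
    print(tasks_per_second(12, 0.2 + 0.5))       # about 17 tasks/s at the fast end
```

Under this model the observed rate corresponds to roughly 1.44 s per task per core, somewhat more than the typical 0.7 to 1.0 s cycle, which is consistent with the occasional 1000-2000ms status latencies pulling the average up.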
