mesos-user mailing list archives

From Rad Gruchalski <ra...@gruchalski.com>
Subject Re: Tasks that run docker images consistently fail while downloading
Date Wed, 28 Oct 2015 10:26:07 GMT
Jim,  

Have you tried --task_launch_timeout? From https://mesosphere.github.io/marathon/docs/native-docker.html:

Configure Marathon
Increase the Marathon command line option --task_launch_timeout
(https://mesosphere.github.io/marathon/docs/command-line-flags.html) to at least the executor
timeout, in milliseconds, you set on your slaves in the previous step.
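
Since the quoted docs give the Marathon value in milliseconds, it is worth checking the units: an executor_registration_timeout of '5mins' would correspond to 300000 ms, while a task_launch_timeout of '3000' would be only 3 seconds. A minimal Python sketch of that arithmetic follows. The helper names are hypothetical, and the unit suffixes are an assumption about what Mesos accepts for duration flags, so check them against the docs for your version.

```python
import re

# Milliseconds per unit. The suffixes are an assumption about the duration
# syntax Mesos accepts (e.g. '5mins'); verify against your Mesos version.
_UNIT_MS = {
    "ms": 1,
    "secs": 1000,
    "mins": 60 * 1000,
    "hrs": 60 * 60 * 1000,
}


def duration_to_ms(value: str) -> int:
    """Convert a duration string such as '5mins' to milliseconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*", value)
    if match is None or match.group(2) not in _UNIT_MS:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(float(match.group(1)) * _UNIT_MS[match.group(2)])


def marathon_timeout_ok(task_launch_timeout_ms: int,
                        executor_registration_timeout: str) -> bool:
    """True if Marathon's timeout (ms) is at least the slave's executor timeout."""
    return task_launch_timeout_ms >= duration_to_ms(executor_registration_timeout)


if __name__ == "__main__":
    print(duration_to_ms("5mins"))
    print(marathon_timeout_ok(3000, "5mins"))
    print(marathon_timeout_ok(300000, "5mins"))
```

Under these assumptions, the values from the original message (3000 against '5mins') would not satisfy the "at least the executor timeout" condition.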

Kind regards,

Radek Gruchalski

radek@gruchalski.com
de.linkedin.com/in/radgruchalski/




On Wednesday, 28 October 2015 at 11:21, James Vanns wrote:

> Hi all.
>  
> Mesos version = 0.23.0-1.0.ubuntu1404 (mesosphere APT repo)
> Marathon version = 0.10.1 (mesosphere APT repo)
>  
> Hopefully this is a simple one for someone to answer, though I couldn't find anything
> immediately obvious in the documentation. We're trialling Mesos in a cloud (EC2/GCE)
> environment and the one thing that continues to bite us in the ass is this: continued
> task failures until the docker image is fully downloaded! Why is this!? Some of our
> images are small (say 200MB), some much larger (2GB) due to the nature of the software
> packages we're containerising. Regardless of this size, they fail the first dozen (or
> more) times until one of the slaves has pulled the image. Why is there an apparent hard
> time-out and how can I avoid it? I don't want the task to register as a fail - it hasn't
> even had a chance to run yet! Up until now we've just been tolerating the bouncing around
> of these tasks but it's now reached a point where it's darn annoying ;)
>  
> I've tried setting executor_registration_timeout to '5mins' but this made no apparent
> difference (every minute the task is killed still). I should note that these tasks are
> launched using the Marathon framework and I've tried setting 'task_launch_timeout' to
> '3000' and again, it makes no difference.
>  
> Based on a brief glance of a mesos slave log file it seems the master instructs the
> slave to kill the task off after 1 minute.
>  
> Please advise.
>  
> Cheers,
>  
> Jim
> --
> Senior Code Pig
> Industrial Light & Magic


