spark-dev mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Mesos/Spark Deadlock
Date Mon, 25 Aug 2014 19:02:39 GMT
BTW it seems to me that even without that patch, you should be getting tasks launched as long
as you leave at least 32 MB of memory free on each machine (that is, the sum of the executor
memory sizes is not exactly equal to the total memory of the machine). Then Mesos will be
able to re-offer that machine whenever CPUs free up.
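
As a rough illustration of the arithmetic above, here is a minimal sketch that checks whether a set of executor memory sizes leaves the 32 MB of headroom on a machine. The function names and the example machine size are hypothetical, and the check assumes that executor memory is the only memory consumer on the machine, which ignores any per-executor overhead.

```python
# Sketch only: the 32 MB threshold is the figure quoted in the message;
# everything else (names, machine sizes) is an illustrative assumption.
MIN_FREE_MB = 32


def free_memory_mb(machine_total_mb, executor_memory_mb):
    """Memory left on a machine after all executors have claimed theirs."""
    return machine_total_mb - sum(executor_memory_mb)


def leaves_headroom(machine_total_mb, executor_memory_mb, min_free_mb=MIN_FREE_MB):
    """True if at least min_free_mb stays free, so the machine can be re-offered."""
    return free_memory_mb(machine_total_mb, executor_memory_mb) >= min_free_mb


if __name__ == "__main__":
    machine_mb = 16 * 1024  # hypothetical 16 GB worker

    # Two 8 GB executors consume the machine exactly: no memory left to offer.
    print(leaves_headroom(machine_mb, [8192, 8192]))

    # One 8 GB query executor, one 4 GB executor and a 128 MB shell leave room.
    print(leaves_headroom(machine_mb, [8192, 4096, 128]))
```

With these inputs the first configuration fills the machine exactly and fails the check, while the second leaves several GB free and passes.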

Matei

On August 25, 2014 at 5:05:56 AM, Gary Malouf (malouf.gary@gmail.com) wrote:

We have not tried the work-around because there are other bugs in there 
that affected our set-up, though it seems it would help. 


On Mon, Aug 25, 2014 at 12:54 AM, Timothy Chen <tnachen@gmail.com> wrote: 

> +1 to getting the workaround in.
> 
> I'll be investigating from the Mesos side too. 
> 
> Tim 
> 
> On Sun, Aug 24, 2014 at 9:52 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
> > Yeah, Mesos in coarse-grained mode probably wouldn't work here. It's too
> > bad that this happens in fine-grained mode -- would be really good to fix.
> > I'll see if we can get the workaround in
> > https://github.com/apache/spark/pull/1860 into Spark 1.1. Incidentally,
> > have you tried that?
> > 
> > Matei 
> > 
> > On August 23, 2014 at 4:30:27 PM, Gary Malouf (malouf.gary@gmail.com) wrote:
> > 
> > Hi Matei, 
> > 
> > We have an analytics team that uses the cluster on a daily basis. They
> > use two types of 'run modes':
> >
> > 1) For running actual queries, they set spark.executor.memory to
> > something between 4 and 8 GB of RAM per worker.
> >
> > 2) A shell that takes a minimal amount of memory on workers (128 MB) for
> > prototyping a larger query. This lets them avoid taking up RAM on the
> > cluster when they do not really need it.
> >
> > We see the deadlocks when there are a few shells in either case. Given
> > our usage patterns, coarse-grained mode would be a challenge, as we
> > would have to constantly remind people to kill their shells as soon as
> > their queries finish.
> >
> > Am I correct in viewing Mesos in coarse-grained mode as being similar to
> > Spark Standalone's CPU allocation behavior?
> >
> > On Sat, Aug 23, 2014 at 7:16 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
> > Hey Gary, just as a workaround, note that you can use Mesos in
> > coarse-grained mode by setting spark.mesos.coarse=true. Then it will hold
> > onto CPUs for the duration of the job.
> > 
> > Matei 
> > 
> > On August 23, 2014 at 7:57:30 AM, Gary Malouf (malouf.gary@gmail.com) wrote:
> >
> > I just wanted to bring up a significant Mesos/Spark issue that makes the
> > combo difficult to use for teams larger than 4-5 people. It's covered in
> > https://issues.apache.org/jira/browse/MESOS-1688. My understanding is
> > that Spark's use of executors in fine-grained mode is a very different
> > behavior than many of the other common frameworks for Mesos.
> > 
> 
