spark-dev mailing list archives

From Aaron Davidson <ilike...@gmail.com>
Subject Re: better compression codecs for shuffle blocks?
Date Tue, 15 Jul 2014 00:06:17 GMT
One of the core problems here is the number of open streams we have, which
is (# cores * # reduce partitions), which can easily climb into the tens of
thousands for large jobs. This is a more general problem that we are
planning on fixing for our largest shuffles, as even moderate buffer sizes
can explode to use huge amounts of memory at that scale.
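As a back-of-envelope illustration of the scaling Aaron describes (all figures below are assumptions for illustration, not measurements from any real job), the open-stream count multiplies out quickly once per-stream buffers are attached:

```java
// Rough sketch of the (# cores * # reduce partitions) memory math.
// Every number here is an illustrative assumption.
public class ShuffleBufferEstimate {
    public static void main(String[] args) {
        int coresPerNode = 16;         // concurrent tasks per node (assumed)
        int reducePartitions = 2000;   // reduce-side partitions (assumed)
        long bufferBytes = 64 * 1024;  // per-stream compression buffer (assumed)

        long openStreams = (long) coresPerNode * reducePartitions;
        long totalBufferBytes = openStreams * bufferBytes;

        System.out.println(openStreams + " open streams");             // 32000
        System.out.println(totalBufferBytes / (1 << 20) + " MB held"); // 2000 MB
    }
}
```

Even a modest 64 KB buffer per stream lands in the gigabytes per node at that stream count, which is the explosion the thread is about.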


On Mon, Jul 14, 2014 at 4:53 PM, Jon Hartlaub <jhartlaub@gmail.com> wrote:

> Is the held memory due to just instantiating the LZFOutputStream?  If so,
> I'm surprised, and I consider that a bug.
>
> I suspect the held memory may be due to a SoftReference - memory will be
> released with enough memory pressure.
>
> Finally, is it necessary to keep 1000 (or more) decoders active?  Would it
> be possible to keep an object pool of encoders and check them in and out as
> needed?  I admit I have not done much homework to determine if this is
> viable.
>
> -Jon
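Jon's check-in/check-out idea could be sketched roughly as below. `Pool`, `checkOut`, and `checkIn` are hypothetical names for illustration; nothing here comes from the actual LZF library's API.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

// Minimal object pool: check out an instance (creating one only when
// the pool is empty), use it, and check it back in so its internal
// buffers can be reused instead of reallocated per stream.
class Pool<T> {
    private final ConcurrentLinkedQueue<T> idle = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;

    Pool(Supplier<T> factory) {
        this.factory = factory;
    }

    T checkOut() {
        T item = idle.poll();
        return item != null ? item : factory.get(); // reuse when possible
    }

    void checkIn(T item) {
        idle.offer(item); // make the instance available for reuse
    }
}
```

In practice a checked-in encoder would be reset before reuse, and a bounded queue would cap the pool at roughly the number of concurrently active tasks rather than the number of reduce partitions.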
>
>
> On Mon, Jul 14, 2014 at 4:08 PM, Reynold Xin <rxin@databricks.com> wrote:
>
> > Copying Jon here since he worked on the lzf library at Ning.
> >
> > Jon - any comments on this topic?
> >
> >
> > On Mon, Jul 14, 2014 at 3:54 PM, Matei Zaharia <matei.zaharia@gmail.com>
> > wrote:
> >
> >> You can actually turn off shuffle compression by setting
> >> spark.shuffle.compress to false. Try that out, there will still be some
> >> buffers for the various OutputStreams, but they should be smaller.
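For reference, the setting Matei names goes in spark-defaults.conf (or the equivalent SparkConf call):

```
spark.shuffle.compress    false
```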
> >>
> >> Matei
> >>
> >> On Jul 14, 2014, at 3:30 PM, Stephen Haberman <stephen.haberman@gmail.com>
> >> wrote:
> >>
> >> >
> >> > Just a comment from the peanut gallery, but these buffers are a real
> >> > PITA for us as well. Probably 75% of our non-user-error job failures
> >> > are related to them.
> >> >
> >> > Just naively, what about not doing compression on the fly? E.g. during
> >> > the shuffle just write straight to disk, uncompressed?
> >> >
> >> > For us, we always have plenty of disk space, and if you're concerned
> >> > about network transmission, you could add a separate compress step
> >> > after the blocks have been written to disk, but before being sent over
> >> > the wire.
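Stephen's "compress later, just before the wire" step could be sketched with nothing but the JDK's gzip stream. Gzip here is only a stand-in codec for illustration; it is not what Spark's shuffle actually uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Sketch: shuffle blocks are written to disk uncompressed, then each
// block's bytes are compressed in one pass right before transmission,
// so no compression buffer stays open during the shuffle write phase.
public class LateCompress {
    static byte[] gzip(byte[] block) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(block);
        }
        return out.toByteArray();
    }
}
```

The trade-off is an extra read-compress pass over the shuffle files, which is exactly the disk-space-for-memory exchange Stephen is proposing.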
> >> >
> >> > Granted, IANAE, so perhaps this is a bad idea; either way, awesome to
> >> > see work in this area!
> >> >
> >> > - Stephen
> >> >
> >>
> >>
> >
>
