accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Snappy as default table.file.compress.type?
Date Sun, 14 Aug 2016 03:41:58 GMT
That's a fair point. I'm off in nebulous vendor land and tend to be removed
from pure Apache Hadoop artifacts. I believe there's a snappy package (at
least on CentOS) that would be enough, but it would be good to understand this.

Is there a non-native snappy implementation?

On Aug 13, 2016 11:19 PM, "Christopher" <ctubbsii@apache.org> wrote:

> Native libraries for snappy are also not typically installed by default on
> Linux distros. Even if the hadoop native libraries are installed, the user
> is likely going to end up using the Java implementation by default, I
> *think*, unless they take additional actions.
>
> On Sat, Aug 13, 2016 at 11:18 PM Adam Fuchs <afuchs@apache.org> wrote:
>
> > In my experience gz gets roughly 1.5x to 2x better compression than snappy.
> > Snappy is definitely not a pareto improvement (although we tend to use
> > snappy by default). Since it's not always better I think you would need a
> > more solid argument to change the default.
> >
> > Adam
> >
> > On Aug 13, 2016 8:06 PM, "Josh Elser" <josh.elser@gmail.com> wrote:
> >
> > > > Same motivation of using it as for making it the default. I am not aware
> > > > of any downside to it. It's become pretty standard across all installations
> > > > I've worked with for years.
> > > >
> > > > Asking because I am no oracle on the matter. I could just be ignorant of
> > > > some issue, but, given my current understanding, there is no downside for
> > > > the average case.
> > >
> > > Christopher wrote:
> > >
> > >> Sorry. I wasn't clear. I understand the motivation for using it... I'm
> > >> asking about the motivation for making it the default.
> > >>
> > > >> Since both are available, I'm not sure the default matters *that* much,
> > > >> but it could be an unexpected change for those preferring GZ.
> > > >>
> > > >> Also, are there any risks regarding library availability of snappy? GZ is
> > > >> pretty ubiquitous.
> > >>
> > > >> On Sat, Aug 13, 2016 at 10:59 PM Josh Elser <josh.elser@gmail.com> wrote:
> > > >>
> > > >>> Uhh, besides what I already mentioned? (close in compressed size but
> > > >>> "much" faster)
> > >>>
> > >>> Christopher wrote:
> > >>>
> > >>>> What's the motivation for changing it?
> > >>>>
> > > >>>> On Sat, Aug 13, 2016 at 10:47 PM Josh Elser <josh.elser@gmail.com> wrote:
> > > >>>>
> > > >>>>> Any reason we don't want to do this? Last rule-of-thumb I heard was that
> > > >>>>> snappy is often close enough in compression to GZ but quite a bit faster
> > > >>>>> (I don't remember exactly how much).
> > >>>>>
> > >>>>> - Josh
> > >>>>>
> > >>>>>
> > >>
> >
>
