accumulo-dev mailing list archives

From "Marc P." <marc.par...@gmail.com>
Subject Re: Snappy as default table.file.compress.type?
Date Sun, 14 Aug 2016 03:36:32 GMT
Perhaps there is a happy medium, though: what if the example configurations
were defined not necessarily by the size of your memory footprint, but by
performance profile? Snappy could be the default for those who want a faster
but less space-conscious implementation. Christopher's concerns would be
allayed, and perhaps those who try Accumulo may get better performance by
using Snappy?

On Sat, Aug 13, 2016 at 11:19 PM, Christopher <ctubbsii@apache.org> wrote:

> Native libraries for snappy are also not typically installed by default on
> Linux distros. Even if the hadoop native libraries are installed, the user
> is likely going to end up using the Java implementation by default, I
> *think*, unless they take additional actions.
>
> On Sat, Aug 13, 2016 at 11:18 PM Adam Fuchs <afuchs@apache.org> wrote:
>
> > In my experience gz gets roughly 1.5x to 2x better compression than snappy.
> > Snappy is definitely not a pareto improvement (although we tend to use
> > snappy by default). Since it's not always better I think you would need a
> > more solid argument to change the default.
> >
> > Adam
> >
> > On Aug 13, 2016 8:06 PM, "Josh Elser" <josh.elser@gmail.com> wrote:
> >
> > > Same motivation for using it as for making it the default. I am not aware
> > > of any downside to it. It's become pretty standard across all installations
> > > I've worked with for years.
> > >
> > > Asking because I am no oracle on the matter. I could just be ignorant of
> > > some issue, but, given my current understanding, there is no downside for
> > > the average case.
> > >
> > > Christopher wrote:
> > >
> > >> Sorry. I wasn't clear. I understand the motivation for using it... I'm
> > >> asking about the motivation for making it the default.
> > >>
> > >> Since both are available, I'm not sure the default matters *that* much, but
> > >> it could be an unexpected change for those preferring GZ.
> > >>
> > >> Also, are there any risks regarding library availability of snappy? GZ is
> > >> pretty ubiquitous.
> > >>
> > >> On Sat, Aug 13, 2016 at 10:59 PM Josh Elser <josh.elser@gmail.com> wrote:
> > >>
> > >>> Uhh, besides what I already mentioned? (close in compressed size but
> > >>> "much" faster)
> > >>>
> > >>> Christopher wrote:
> > >>>
> > >>>> What's the motivation for changing it?
> > >>>>
> > >>>> On Sat, Aug 13, 2016 at 10:47 PM Josh Elser <josh.elser@gmail.com> wrote:
> > >>>>
> > >>>>> Any reason we don't want to do this? Last rule-of-thumb I heard was that
> > >>>>> snappy is often close enough in compression to GZ but quite a bit faster
> > >>>>> (I don't remember exactly how much).
> > >>>>>
> > >>>>> - Josh
> > >>>>>
> > >>>>>
> > >>
> >
>
