accumulo-dev mailing list archives

From Michael Wall <mjw...@gmail.com>
Subject Re: Snappy as default table.file.compress.type?
Date Mon, 15 Aug 2016 15:16:10 GMT
I like the idea of making snappy the default.  However, I am concerned
about raising the barrier to entry for new users by adding yet another
dependency to install.

On Mon, Aug 15, 2016 at 11:13 AM, Josh Elser <josh.elser@gmail.com> wrote:

> No, I never asserted that Snappy is *always* the better choice. I would
> say that I believe Snappy is better in *most cases*.
>
> Most users I talk to (with and without Accumulo involved) have plenty of
> disk space available to them. It is rare that space on disk is actually a
> concern. Instead, performance is usually the primary metric of concern. To
> be crystal clear, this is only my opinion on users I've talked to, not an
> assertion on everyone.
>
> I do not believe I need a better argument than "on average, we can make
> out of the box performance better for most users". I suppose we'll have to
> disagree on that point. Thanks for clarifying your opinions on the topic.
>
>
> Adam Fuchs wrote:
>
>> If the crux of your argument was that snappy is always a better choice,
>> then my retort was to say it is not, since sometimes compression ratio can
>> be a dominant factor. Changes to defaults are disruptive for existing
>> users, so you need a better argument. I don't mean that you shouldn't
>> continue to debate the merits. By all means, do continue the conversation.
>>
>> Adam
>>
>> On Aug 13, 2016 8:39 PM, "Josh Elser" <josh.elser@gmail.com> wrote:
>>
>>> Your argument fails to address the performance benefits. I could pose the
>>> same question back to you: you need to prove why we shouldn't use the
>>> faster compression algorithm.
>>>
>>> I don't mean to be snarky, but your argument is shutting down
>>> conversation. I appreciate you sharing the opinion but don't feel like
>>> it's encouraging discussion.
>>>
>>> On Aug 13, 2016 11:18 PM, "Adam Fuchs" <afuchs@apache.org> wrote:
>>>
>>>> In my experience gz gets roughly 1.5x to 2x better compression than
>>>> snappy. Snappy is definitely not a pareto improvement (although we tend
>>>> to use snappy by default). Since it's not always better I think you
>>>> would need a more solid argument to change the default.
>>>>
>>>> Adam
>>>>
>>>> On Aug 13, 2016 8:06 PM, "Josh Elser" <josh.elser@gmail.com> wrote:
>>>>
>>>>> Same motivation of using it as for making it the default. I am not
>>>>> aware of any downside to it. It's become pretty standard across all
>>>>> installations I've worked with for years.
>>>>>
>>>>> Asking because I am no oracle on the matter. I could just be ignorant
>>>>> of some issue, but, given my current understanding, there is no
>>>>> downside for the average case.
>>>>>
>>>>> Christopher wrote:
>>>>>
>>>>>> Sorry. I wasn't clear. I understand the motivation for using it... I'm
>>>>>> asking about the motivation for making it the default.
>>>>>>
>>>>>> Since both are available, I'm not sure the default matters *that* much,
>>>>>> but it could be an unexpected change for those preferring GZ.
>>>>>>
>>>>>> Also, are there any risks regarding library availability of snappy? GZ
>>>>>> is pretty ubiquitous.
>>>>>>
>>>>>> On Sat, Aug 13, 2016 at 10:59 PM Josh Elser <josh.elser@gmail.com> wrote:
>>>>>>
>>>>>>> Uhh, besides what I already mentioned? (close in compressed size but
>>>>>>> "much" faster)
>>>>>>>
>>>>>>> Christopher wrote:
>>>>>>>
>>>>>>>> What's the motivation for changing it?
>>>>>>>>
>>>>>>>> On Sat, Aug 13, 2016 at 10:47 PM Josh Elser <josh.elser@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Any reason we don't want to do this? Last rule-of-thumb I heard was
>>>>>>>>> that snappy is often close enough in compression to GZ but quite a bit
>>>>>>>>> faster (I don't remember exactly how much).
>>>>>>>>>
>>>>>>>>> - Josh
