accumulo-user mailing list archives

From Marc Parisi <m...@accumulo.net>
Subject Re: compressing values returned to scanner
Date Mon, 01 Oct 2012 20:26:36 GMT
Ameet, keys and values (relative keys) are extracted from a decompressor
stream. In the case of block compression (e.g. gz), you would need to
return a whole block so the receiver can decompress it. Therefore, using
your existing compression, as Slacum mentioned, and then decompressing the
value on the client is likely the best method.
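
For reference, the compress-on-ingest / decompress-on-scan approach could be
sketched roughly as below. This is only a minimal illustration using the JDK's
gzip streams; the class name ValueCodec and its two methods are hypothetical
and are not part of Accumulo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: gzip a value before ingest, gunzip it after a scan. */
final class ValueCodec {
    private ValueCodec() {}

    /** Returns the gzip-compressed form of raw. */
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read buf only afterwards.
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Returns the original bytes for a buffer produced by compress. */
    static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[8192];
            int n;
            // Copy decompressed bytes until the stream signals end of data.
            while ((n = gz.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

On ingest you would wrap the output of compress(...) in the Value you put
into the Mutation, and on the client call decompress(...) on
entry.getValue().get() for each scanned entry. The tablet server then only
ever sees opaque compressed bytes, which is fine as long as no server-side
iterator needs to read the value.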


On Mon, Oct 1, 2012 at 4:00 PM, William Slacum <
wilhelm.von.cloud@accumulo.net> wrote:

> Someone can correct me if I'm wrong, but I believe the file compression
> option you quoted is for the RFiles in HDFS. You can enable compression
> there and will still see some benefit even if you compress the values on
> ingest.
>
>
> On Mon, Oct 1, 2012 at 12:40 PM, ameet kini <ameetkini@gmail.com> wrote:
>
>> That is exactly my use case (ingest once, serve often, no server-side
>> iterators).
>>
>> And I'm doing pre-compression on ingest. I was just looking to do away
>> with app-level compression code. Not a biggie.
>>
>> Ameet
>>
>>
>> On Mon, Oct 1, 2012 at 3:32 PM, William Slacum <
>> wilhelm.von.cloud@accumulo.net> wrote:
>>
>>> If you aren't often looking at the data in the value on the tablet
>>> server (like in an iterator), you can also pre-compress your values on
>>> ingest.
>>>
>>>
>>> On Mon, Oct 1, 2012 at 12:19 PM, Marc Parisi <marc@accumulo.net> wrote:
>>>
>>>> You could compress the data in the value, and decompress the data upon
>>>> receipt by the scanner.
>>>>
>>>>
>>>> On Mon, Oct 1, 2012 at 3:03 PM, ameet kini <ameetkini@gmail.com> wrote:
>>>>
>>>>>
>>>>> My understanding of compression in Accumulo 1.4.1 is that it is on by
>>>>> default and that data is decompressed by the tablet server, so data on the
>>>>> wire between server/client is decompressed. Is there a way to shift the
>>>>> decompression from happening on the server to the client? I have a use case
>>>>> where each Value in my table is relatively large (~ 8MB) and I can benefit
>>>>> from compression over the wire. I don't have any server side iterators, so
>>>>> the values don't need to be decompressed by the tablet server. Also, each
>>>>> scan returns a few rows, so client-side decompression can be fast.
>>>>>
>>>>> The only way I can think of now is to disable compression on that
>>>>> table, and handle compression/decompression in the application. But if
>>>>> there is a way to do this in Accumulo, I'd prefer that.
>>>>>
>>>>> Thanks,
>>>>> Ameet
>>>>>
>>>>
>>>>
>>>
>>
>
