accumulo-dev mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: How are the Column Visibility elements stored in Accumulo
Date Wed, 19 Feb 2014 20:32:27 GMT
Even in 1.4, though, GZip helps.
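A rough way to see the effect is to gzip a stream of visibility strings directly. This is only a sketch: the expressions below are made up, and real RFiles compress whole blocks of keys and values rather than visibilities in isolation, so the ratios are illustrative and not a measurement of Accumulo.

```python
import gzip
import random


def compressed_ratio(visibilities):
    """Return compressed size / raw size for newline-joined visibilities."""
    raw = b"\n".join(visibilities)
    return len(gzip.compress(raw)) / len(raw)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Case 1: a handful of (hypothetical) expressions repeated many times.
few = [b"admin&audit", b"finance|hr", b"ops&(public|hr)"]
repeated = [rng.choice(few) for _ in range(10_000)]

# Case 2: every entry is a distinct random token, standing in for
# "lots of different visibilities".
diverse = [("%016x" % rng.getrandbits(64)).encode() for _ in range(10_000)]

repeated_ratio = compressed_ratio(repeated)
diverse_ratio = compressed_ratio(diverse)
print(f"repeated: {repeated_ratio:.3f}  diverse: {diverse_ratio:.3f}")
```

The repeated case should shrink to a small fraction of its raw size, while the diverse case gains far less.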

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Wed, Feb 19, 2014 at 11:40 AM, Keith Turner <keith@deenlo.com> wrote:
> On Tue, Feb 18, 2014 at 10:14 PM, Mike Drob <madrob@cloudera.com> wrote:
>
>> The column visibility is stored as bytes on disk, derived from the entire
>> visibility expression. In theory this may seem like a lot of space, but in
>> practice it turns out to be fine for a couple of reasons.
>>
>> First, RFiles employ relative key encoding, so if the visibility is the
>> same in two consecutive keys, then the second one is simply omitted.
>>
>
> In 1.5, common prefixes in consecutive key fields may be compressed away.
> In 1.4 the entire field had to match.
>
> https://issues.apache.org/jira/browse/ACCUMULO-790
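To make the difference between the two behaviours concrete, here is a minimal sketch. It is not the actual RFile on-disk format; the function names and return shapes are hypothetical and only illustrate the idea of whole-field matching versus common-prefix matching.

```python
from typing import Optional, Tuple


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Count the leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_whole_field(prev: bytes, cur: bytes) -> Optional[bytes]:
    """1.4-style idea: omit the field (None) only if it equals the previous one."""
    return None if cur == prev else cur


def encode_prefix(prev: bytes, cur: bytes) -> Tuple[int, bytes]:
    """1.5-style idea: store the shared-prefix length plus the remaining suffix."""
    n = common_prefix_len(prev, cur)
    return n, cur[n:]


def decode_prefix(prev: bytes, encoded: Tuple[int, bytes]) -> bytes:
    """Rebuild the field from the previous field and the (length, suffix) pair."""
    n, suffix = encoded
    return prev[:n] + suffix


# Hypothetical visibilities on two consecutive keys.
prev_vis = b"admin&audit"
cur_vis = b"admin&finance"

whole = encode_whole_field(prev_vis, cur_vis)   # fields differ, so all bytes are kept
prefixed = encode_prefix(prev_vis, cur_vis)     # only the differing tail is kept
restored = decode_prefix(prev_vis, prefixed)
```

With whole-field matching the second visibility is written out in full because it is not identical to the first; with prefix matching only the tail after the shared `admin&` needs to be written.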
>
>
>> Also, RFiles use gz compression by default. If you have a few similar
>> (repeated) text strings to represent your visibilities, then they will
>> compress very well.
>>
>> However, if you have lots of different visibilities, then you may not end
>> up gaining much from the storage tricks we employ.
>>
>> Mike
>>
>>
>> On Mon, Feb 17, 2014 at 10:30 PM, Sitaraman Vilayannur <
>> vrsitaramanietflists@gmail.com> wrote:
>>
>> > Hi,
>> >   How are the column visibility elements stored in Accumulo? Is there a
>> > kind of compression that is used to save space, or are all the elements
>> > for each key-value pair stored as is?
>> >   A pointer to the region of the code that I should look at for the
>> > implementation will also be helpful.
>> > Thanks
>> > Sitaraman
>> >
>>
