accumulo-dev mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: How are the Column Visibility elements stored in Accumulo
Date Wed, 19 Feb 2014 16:40:12 GMT
On Tue, Feb 18, 2014 at 10:14 PM, Mike Drob <madrob@cloudera.com> wrote:

> The column visibility is stored as bytes on disk, derived from the entire
> visibility expression. In theory this may seem like a lot of space, but in
> practice it turns out to be fine for a couple of reasons.
>
> First, RFiles employ relative key encoding, so if the visibility is the
> same in two consecutive keys, then the second one is simply omitted.
>

In 1.5, common prefixes in consecutive key fields may be compressed away.
In 1.4 the entire field had to match.

https://issues.apache.org/jira/browse/ACCUMULO-790
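
To illustrate the difference between the two behaviours described above, here is a minimal sketch in Python. It is not the actual RFile on-disk format or Accumulo's code: the function names, the tuple layout, and the sample visibility strings are all made up for illustration. It only shows the idea that a field identical to the previous one costs nothing, and that with prefix compression only the differing suffix needs to be written.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_field(prev: bytes, cur: bytes, prefix_compression: bool):
    """Describe how `cur` could be written relative to `prev`.

    Returns a hypothetical (tag, shared_len, payload) tuple:
      "same"   - field equals the previous one, nothing is written
      "prefix" - only the suffix after the shared prefix is written
      "full"   - the whole field is written
    """
    if cur == prev:
        return ("same", len(cur), b"")
    if prefix_compression:
        n = common_prefix_len(prev, cur)
        if n > 0:
            return ("prefix", n, cur[n:])
    return ("full", 0, cur)


# Hypothetical visibility expressions on consecutive keys.
visibilities = [b"admin&audit", b"admin&audit", b"admin&ops", b"public"]

for mode in (False, True):
    prev = b""
    print("prefix compression:", mode)
    for vis in visibilities:
        print("  ", encode_field(prev, vis, prefix_compression=mode))
        prev = vis
```

With `prefix_compression=False` (the exact-match-only behaviour), `admin&ops` following `admin&audit` is written in full; with `prefix_compression=True` only the suffix `ops` is written.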


> Also, RFiles use gz encoding by default. If you have a few similar
> (repeated) text strings to represent your visibilities, then they will
> compress very well.
>
> However, if you have lots of different visibilities, then you may not end
> up gaining much from the storage tricks we employ.
>
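
A rough way to see the effect Mike describes is to compress two synthetic sets of visibility strings with Python's `zlib` module (DEFLATE, the algorithm behind gz). This is only a sketch under assumptions: the visibility strings are invented, the strings are simply joined with newlines rather than laid out as RFile blocks, and the helper name `compressed_ratio` is hypothetical.

```python
import os
import zlib


def compressed_ratio(visibilities):
    """Compressed size divided by raw size for newline-joined byte strings."""
    raw = b"\n".join(visibilities)
    return len(zlib.compress(raw)) / len(raw)


# A few visibility expressions repeated many times.
repeated = [b"admin&audit", b"public", b"admin&ops"] * 1000

# Many different visibilities, modelled here as random hex labels.
distinct = [os.urandom(8).hex().encode() for _ in range(3000)]

ratio_repeated = compressed_ratio(repeated)
ratio_distinct = compressed_ratio(distinct)

print(f"repeated visibilities: {ratio_repeated:.3f} of raw size")
print(f"distinct visibilities: {ratio_distinct:.3f} of raw size")
```

The repeated set should shrink to a small fraction of its raw size, while the set of distinct labels compresses far less, which matches the caveat about having lots of different visibilities.
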
> Mike
>
>
> On Mon, Feb 17, 2014 at 10:30 PM, Sitaraman Vilayannur <
> vrsitaramanietflists@gmail.com> wrote:
>
> > Hi,
> >   How are the column visibility elements stored in Accumulo? Is there a
> > kind of compression that is used to save space, or are all the elements
> > for each key-value pair stored as is?
> >   A pointer to the region of the code that I should look at for the
> > implementation would also be helpful.
> > Thanks
> > Sitaraman
> >
>
