hbase-user mailing list archives

From Nasron Cheong <nasron.che...@kontagent.com>
Subject Re: Column qualifiers with hierarchy and filters
Date Wed, 06 Nov 2013 14:48:22 GMT
Yes, after some digging around, the key is to store the integers in their
fixed-length byte representation and, more importantly, in big-endian order
so that the lexicographical sort order matches the numeric order.
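
A minimal Python sketch of the idea, assuming unsigned 4-byte bucket numbers;
the `qualifier` helper and the sample values are hypothetical and are not taken
from HBase or from this thread. It compares sorting by decimal string, by
big-endian bytes, and by little-endian bytes:

```python
import struct


def qualifier(bucket: int, msg_type: str) -> bytes:
    """Hypothetical helper: 4-byte big-endian unsigned bucket, then the message type."""
    return struct.pack(">I", bucket) + msg_type.encode("utf-8")


buckets = [1, 1000, 2, 256]

# Decimal strings compare byte by byte, so 1000:abc sorts ahead of 2:abc.
as_text = sorted(f"{b}:abc".encode("utf-8") for b in buckets)

# Fixed-width big-endian bytes compare most significant byte first,
# so the byte order matches the numeric order.
as_big_endian = sorted(qualifier(b, "abc") for b in buckets)

# Little-endian bytes compare least significant byte first,
# so 256 (00 01 00 00) sorts ahead of 1 (01 00 00 00).
as_little_endian = sorted(struct.pack("<I", b) for b in buckets)

# Decode the bucket numbers again to inspect the resulting orders.
decoded_be = [struct.unpack(">I", q[:4])[0] for q in as_big_endian]
decoded_le = [struct.unpack("<I", q)[0] for q in as_little_endian]

print(as_text)
print(decoded_be)
print(decoded_le)
```

Because the bucket occupies a fixed four bytes at the front, the output of
`struct.pack(">I", 1000)` can serve as the byte prefix when matching on bucket
1000. Signed values would need extra handling (for example, flipping the sign
bit) to keep negative numbers ahead of positive ones.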

Thanks!

- Nasron


On Tue, Nov 5, 2013 at 8:28 PM, Premal Shah <premal.j.shah@gmail.com> wrote:

> You can store the fixed-length byte representation of the integer instead
> of its string form (which has variable length); the values will then also
> sort correctly.
>
>
> On Tue, Nov 5, 2013 at 1:58 PM, Nasron Cheong
> <nasron.cheong@kontagent.com> wrote:
>
> > Yes, it's limited in the sense that we have to precalculate the number of
> > digits required so we don't run out, and if we overestimate, then our row
> > keys end up taking up more space than we'd care to.
> >
> > We can probably live with this approach for now, but I wonder if there's a
> > better way.
> >
> > - Nasron
> >
> >
> > On Tue, Nov 5, 2013 at 12:28 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> > > Hi Nasron,
> > >
> > > Why do you say that it's limited? Does it meet your needs?
> > >
> > >
> > > 2013/11/4 Nasron Cheong <nasron.cheong@kontagent.com>
> > >
> > > > An example query would be the following. Say the column qualifier
> > > > is of the form
> > > >
> > > > <bucket #>:<msg type>
> > > >
> > > > where <bucket #> is an integer value and <msg type> is a string. E.g.
> > > >
> > > > 1:abc
> > > > 1000:abc
> > > > 2:abc
> > > >
> > > > would appear in the above sequence, which is out of order when doing
> > > > prefix filtering. Zero padding could fix this:
> > > >
> > > > 0001:abc
> > > > 0002:abc
> > > > 1000:abc
> > > >
> > > > But it is a limited way of ensuring that the sequence of CQs (column
> > > > qualifiers) is correct in order for prefix filtering to work. Are
> > > > there other options?
> > > >
> > > > - Nasron
> > > >
> > > >
> > > > On Thu, Oct 31, 2013 at 9:19 PM, Nasron Cheong
> > > > <nasron.cheong@kontagent.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm trying to determine the best way to serialize a sequence of
> > > > > integers/strings that represent a hierarchy for a column qualifier,
> > > > > which would be compatible with ColumnPrefixFilter and
> > > > > BinaryComparator.
> > > > >
> > > > > However, due to the lexicographical sorting, it's awkward to
> > > > > serialize the sequence of values needed to get it to work.
> > > > >
> > > > > What are the typical solutions to this? Do people just zero-pad
> > > > > integers to make sure they sort correctly? Or do I have to implement
> > > > > my own QualifierFilter, which seems expensive since I'd be
> > > > > deserializing every byte array just to compare?
> > > > >
> > > > > Thanks
> > > > >
> > > > > - Nasron
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Regards,
> Premal Shah.
>
