accumulo-user mailing list archives

From Adam Fuchs <afu...@apache.org>
Subject Re: Suggestions on modeling a composite row key
Date Wed, 27 Feb 2013 15:44:15 GMT
At sqrrl, we tend to use a Tuple class that implements List<String>
(List<ByteBuffer> would also work), and has conversions to and from
ByteBuffer. To encode the tuple into a byte buffer, change all the "\1"s to
"\1\2", change all the "\0"s to "\1\1", and put a "\0" byte between
elements. "\1" is used as an escape character for all of the "\1"s and
"\0"s appearing in the unencoded form. To decode, just split on "\0"
and reverse the escaping. This encoding preserves hierarchical,
lexicographical ordering of tuple elements.
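
A minimal sketch of the scheme described above, not the actual sqrrl Tuple class. The names TupleCodec, encode, and decode are hypothetical, and UTF-8 is an assumed string encoding. Here "\0" and "\1" are taken to mean the single bytes 0x00 and 0x01.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the escaping scheme; not the sqrrl Tuple class.
final class TupleCodec {
    private TupleCodec() {}

    // 0x00 -> 0x01 0x01, 0x01 -> 0x01 0x02, and a bare 0x00 between elements.
    static byte[] encode(List<String> elements) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < elements.size(); i++) {
            if (i > 0) {
                out.write(0x00); // element separator
            }
            // UTF-8 is an assumption; any fixed byte encoding would do.
            for (byte b : elements.get(i).getBytes(StandardCharsets.UTF_8)) {
                if (b == 0x00) {
                    out.write(0x01);
                    out.write(0x01);
                } else if (b == 0x01) {
                    out.write(0x01);
                    out.write(0x02);
                } else {
                    out.write(b);
                }
            }
        }
        return out.toByteArray();
    }

    // Splits on bare 0x00 bytes and reverses the escaping in a single pass.
    // Note: an empty input decodes to one empty element, so an empty tuple
    // and a tuple holding one empty string are indistinguishable here.
    static List<String> decode(byte[] encoded) {
        List<String> elements = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length; i++) {
            byte b = encoded[i];
            if (b == 0x00) {
                elements.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
                current.reset();
            } else if (b == 0x01) {
                if (i + 1 >= encoded.length) {
                    throw new IllegalArgumentException("dangling escape byte");
                }
                byte next = encoded[++i];
                if (next == 0x01) {
                    current.write(0x00);
                } else if (next == 0x02) {
                    current.write(0x01);
                } else {
                    throw new IllegalArgumentException("invalid escape sequence");
                }
            } else {
                current.write(b);
            }
        }
        elements.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
        return elements;
    }
}
```

With this sketch, a key such as ("http://foo.com/dir/page%20name.htm", "12") would be built with TupleCodec.encode(Arrays.asList(url, "12")). The ordering holds because the separator 0x00 sorts below every byte that can appear inside an encoded element, so a shorter element always sorts before any element that extends it.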

Cheers,
Adam



On Tue, Feb 26, 2013 at 11:51 PM, Mike Hugo <mike@piragua.com> wrote:

> I need to build up a row key that consists of two parts, the first being a
> URL (e.g. http://foo.com/dir/page%20name.htm) and the second being a
> number (e.g. "12").
>
> To date we've been using \u0000 to delimit these two pieces of the key,
> but that has some headaches associated with it.
>
> I'm curious to know how other people have delimited composite row keys.
>  Any best practices or suggestions?
>
> Thanks,
>
> Mike
>
