accumulo-user mailing list archives

From "Adam J. Shook" <adamjsh...@gmail.com>
Subject Re: Map Lexicoder
Date Mon, 28 Dec 2015 22:16:58 GMT
Hi Josh,

Thanks for the advice.  I agree that using the CQ and Value is better than
putting the whole map into a Value, but what I am working on uses a
relational model for mapping data to Accumulo and expects the value of each
cell to be in the Value.  There are certainly optimization opportunities in
the 'better' ways of storing data in Accumulo, but I'd like to get
this working before diving into that rabbit hole.

From a brief look, the ListLexicoder encodes each element of the list using a
sub-lexicoder and escapes each element (0x00 -> 0x01 0x01 and 0x01 -> 0x01
0x02).  The voodoo here escapes me a little (pun!), but it seems to be
enough to enable multi-dimensional arrays encoded by nesting ListLexicoders
(tested up to 4D, haven't tried a fifth dimension).  I would expect something
similar could be done for a Map.  Would a MapLexicoder be something worth
contributing to the project?  I'd be happy to take a stab at it.
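
For reference, here is a minimal sketch of the escaping scheme as I understand
it from the description above.  It is not the actual Accumulo code: the class
and method names (EscapeSketch, escape, unescape, join, split) are hypothetical,
and using 0x00 as the separator between escaped elements is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the Accumulo implementation.
public class EscapeSketch {

    // Escape 0x00 -> 0x01 0x01 and 0x01 -> 0x01 0x02, so the result never contains 0x00.
    static byte[] escape(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : in) {
            if (b == 0x00) {
                out.write(0x01);
                out.write(0x01);
            } else if (b == 0x01) {
                out.write(0x01);
                out.write(0x02);
            } else {
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    // Reverse of escape(); assumes well-formed input (every 0x01 is followed by 0x01 or 0x02).
    static byte[] unescape(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < in.length; i++) {
            if (in[i] == 0x01) {
                i++;
                out.write(in[i] == 0x01 ? 0x00 : 0x01);
            } else {
                out.write(in[i]);
            }
        }
        return out.toByteArray();
    }

    // Escape each element and join with 0x00 (assumed separator).
    static byte[] join(List<byte[]> elements) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < elements.size(); i++) {
            if (i > 0) {
                out.write(0x00);
            }
            byte[] escaped = escape(elements.get(i));
            out.write(escaped, 0, escaped.length);
        }
        return out.toByteArray();
    }

    // Split on 0x00 and unescape each piece. Note: empty input yields one empty element.
    static List<byte[]> split(byte[] in) {
        List<byte[]> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= in.length; i++) {
            if (i == in.length || in[i] == 0x00) {
                parts.add(unescape(Arrays.copyOfRange(in, start, i)));
                start = i + 1;
            }
        }
        return parts;
    }
}
```

Because an escaped element never contains 0x00, the output of one join() can be
passed as an element to an outer join(), which would explain why nesting works.
A map could plausibly be handled the same way by treating each entry as an
escaped key/value pair, though I haven't tried that yet.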

--Adam

On Mon, Dec 28, 2015 at 12:21 PM, Josh Elser <josh.elser@gmail.com> wrote:

> Looks like you would have to implement some kind of ComparableMap to be
> able to use the PairLexicoder (see that the parameterization requires both
> types in the Pair to implement Comparable). The Pair lexicoder requires
> these Comparable types to align itself with the original goal of the
> Lexicoders: provide byte-array serialization for types whose sort order
> matches the original object's ordering.
>
> Typically, when we have key to value style data we want to put in
> Accumulo, it makes sense to leverage the Column Qualifier and the Value,
> instead of serializing everything into one Accumulo Value. Iterators make
> it easy to do server-side predicates and transformations. My hunch is that
> this is another reason why you don't already see a MapLexicoder provided.
>
> One technical difficulty you might run into implementing a generalized
> MapLexicoder is how you delimit the key and value in one pair and how you
> delimit many pairs from each other. Commonly, the "null" byte (\x00) is
> used as a separator since it doesn't often appear in user-data. I'm not
> sure if some of the other Lexicoders already use this in their
> serialization (e.g. the ListLexicoder might, I haven't looked at the code).
> Nesting Lexicoders generically might be tricky (although not impossible) --
> thought it was worth mentioning to make sure you thought about it.
>
>
> Adam J. Shook wrote:
>
>> Hello all,
>>
>> Any suggestions for using a Map Lexicoder (or implementing one)?  I am
>> currently using new ListLexicoder(new PairLexicoder(some lexicoder,
>> some lexicoder)), which is working for single maps.  However, when one of
>> the lexicoders in the Pair is itself a Map (and therefore another
>> ListLexicoder(PairLexicoder)), an exception is being thrown because
>> ArrayList is not Comparable.
>>
>> Regards,
>> --Adam
>>
>
