accumulo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Adam J. Shook" <adamjsh...@gmail.com>
Subject Re: Map Lexicoder
Date Tue, 29 Dec 2015 21:53:47 GMT
Agreed, I came to the same conclusion while implementing.  The final result
that I have is a SortedMapLexicoder to avoid any comparisons going
haywire.  Additionally, would it be best to encode the map as an array of
keys followed by an array of values, or encode all key value pairs
back-to-back:

{ a : 1 , b : 2, c : 3 } encoded as

a1b2c3
-or-
abc123

Feels like I should be encoding a list of keys, then the list of values,
and then concatenating these two encoded byte arrays.  I think the end
solution will be to support both?  I'm having a hard time reconciling which
method is better, if any.  Hard to find some good examples of people who
are sorting a list of maps.

On Tue, Dec 29, 2015 at 2:47 PM, Keith Turner <keith@deenlo.com> wrote:

>
>
> On Mon, Dec 28, 2015 at 11:47 AM, Adam J. Shook <adamjshook@gmail.com>
> wrote:
>
>> Hello all,
>>
>> Any suggestions for using a Map Lexicoder (or implementing one)?  I am
>> currently using a new ListLexicoder(new PairLexicoder(some lexicoder, some
>> lexicoder), which is working for single maps.  However, when one of the
>> lexicoders in the Pair is itself a Map (and therefore another
>> ListLexicoder(PairLexicoder)), an exception is being thrown because
>> ArrayList is not Comparable.
>>
>
>
> Since maps do not have a well defined order of keys and values, comparison
> is tricky.   The purpose of Lexicoders is encode things in such a way that
> the lexicographical comparison of the serialized data is possible.  With a
> hashmap if I add the same data in the same order to two different hash map
> instances, its possible that when iterating over those maps I could see the
> data in different orders.   This could lead to two maps constructed in the
> same way at different times (like different JVMs with different
> implementations of HashMap) generating different data that compare as
> different.  Ideally comparison of the two would yield equality.
>
> Something like LinkedHashMap does not have this problem for the same
> insertion order.  If you want things to be comparable regardless of
> insertion order (which I think is more intuitive), then SortedMap seems
> like it would be a good candidate.  So maybe a SortedMapLexicoder would be
> a better thing to offer?
>
>
>> Regards,
>> --Adam
>>
>
>

Mime
View raw message