From: "Adam J. Shook"
Date: Tue, 29 Dec 2015 16:53:47 -0500
Subject: Re: Map Lexicoder
To: user@accumulo.apache.org

Agreed, I came to the same conclusion while implementing. The final result
that I have is a SortedMapLexicoder, to avoid any comparisons going haywire.

Additionally, would it be best to encode the map as an array of keys
followed by an array of values, or to encode all key-value pairs
back-to-back?

{ a : 1 , b : 2, c : 3 } encoded as

a1b2c3
-or-
abc123

Feels like I should be encoding a list of keys, then the list of values, and
then concatenating these two encoded byte arrays. I think the end solution
will be to support both? I'm having a hard time reconciling which method is
better, if any. It is hard to find good examples of people who are sorting a
list of maps.

On Tue, Dec 29, 2015 at 2:47 PM, Keith Turner <keith@deenlo.com> wrote:

> On Mon, Dec 28, 2015 at 11:47 AM, Adam J. Shook <adamjshook@gmail.com>
> wrote:
>
>> Hello all,
>>
>> Any suggestions for using a Map Lexicoder (or implementing one)? I am
>> currently using a new ListLexicoder(new PairLexicoder(some lexicoder,
>> some lexicoder)), which is working for single maps. However, when one of
>> the lexicoders in the Pair is itself a Map (and therefore another
>> ListLexicoder(PairLexicoder)), an exception is being thrown because
>> ArrayList is not Comparable.
>
> Since maps do not have a well-defined order of keys and values, comparison
> is tricky. The purpose of Lexicoders is to encode things in such a way
> that lexicographical comparison of the serialized data is possible. With
> a hashmap, if I add the same data in the same order to two different hash
> map instances, it's possible that when iterating over those maps I could
> see the data in different orders. This could lead to two maps constructed
> in the same way at different times (like different JVMs with different
> implementations of HashMap) generating different data that compare as
> different. Ideally, comparison of the two would yield equality.
>
> Something like LinkedHashMap does not have this problem for the same
> insertion order. If you want things to be comparable regardless of
> insertion order (which I think is more intuitive), then SortedMap seems
> like it would be a good candidate. So maybe a SortedMapLexicoder would be
> a better thing to offer?
>
>> Regards,
>> --Adam
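To make the two layouts in this thread concrete, here is a minimal, self-contained sketch of both for a SortedMap<String, Integer>. It does not use Accumulo's Lexicoder classes. The class name, the method names, and the escaping scheme (a 0x00 terminator per element, with 0x00 and 0x01 escaped) are all assumptions made for illustration, not a description of what a real SortedMapLexicoder does or should do.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from Accumulo.
final class SortedMapEncodingSketch {

    // Writes one element followed by a 0x00 terminator. Inside the element,
    // 0x00 becomes 0x01 0x01 and 0x01 becomes 0x01 0x02, so the terminator
    // sorts below every escaped byte and byte order is preserved.
    static void writeElement(ByteArrayOutputStream out, byte[] data) {
        for (byte b : data) {
            if (b == 0x00) { out.write(0x01); out.write(0x01); }
            else if (b == 0x01) { out.write(0x01); out.write(0x02); }
            else { out.write(b); }
        }
        out.write(0x00);
    }

    static byte[] encodeKey(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // Big-endian int with the sign bit flipped, so negatives sort first.
    static byte[] encodeValue(int v) {
        int u = v ^ 0x80000000;
        return new byte[] {(byte) (u >>> 24), (byte) (u >>> 16), (byte) (u >>> 8), (byte) u};
    }

    // Layout "a1b2c3": each key immediately followed by its value.
    static byte[] encodeInterleaved(SortedMap<String, Integer> map) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            writeElement(out, encodeKey(e.getKey()));
            writeElement(out, encodeValue(e.getValue()));
        }
        return out.toByteArray();
    }

    // Layout "abc123": all keys in sorted order, then all values in key order.
    static byte[] encodeKeysThenValues(SortedMap<String, Integer> map) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String k : map.keySet()) writeElement(out, encodeKey(k));
        for (Integer v : map.values()) writeElement(out, encodeValue(v));
        return out.toByteArray();
    }

    // Lexicographic comparison of byte arrays, treating bytes as unsigned.
    static int compareUnsigned(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int c = (x[i] & 0xff) - (y[i] & 0xff);
            if (c != 0) return c;
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> m1 = new TreeMap<>();
        m1.put("b", 1);
        m1.put("a", 2); // insertion order does not matter for a TreeMap
        SortedMap<String, Integer> m2 = new TreeMap<>();
        m2.put("a", 1);
        m2.put("c", 1);
        System.out.println("interleaved:      " + Integer.signum(
            compareUnsigned(encodeInterleaved(m1), encodeInterleaved(m2))));
        System.out.println("keys-then-values: " + Integer.signum(
            compareUnsigned(encodeKeysThenValues(m1), encodeKeysThenValues(m2))));
    }
}
```

With { a : 2, b : 1 } and { a : 1, c : 1 }, the interleaved layout sorts the first map after the second (the value for "a" decides it), while the keys-then-values layout sorts it before (the key "b" versus "c" decides it). So under this sketch the choice between the two layouts is really a choice of sort order: entry by entry, or key set first and values as a tie-breaker.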