Subject: Re: json vs. JSON
From: Till Westmann
Date: Thu, 13 Aug 2015 17:13:58 -0700
To: "dev@asterixdb.incubator.apache.org"
Message-Id: <298016D8-1E19-4E15-8CC8-7DEE28257D77@apache.org>

> On Aug 13, 2015, at 14:59, Chris Hillery wrote:
>
>> On Wed, Aug 12, 2015 at 10:26 PM, Till Westmann wrote:
>>
>>
>> I really would like to get to a consistent set of rules on how we
>> serialize ADM instances to JSON.
>> My proposal for those rules is:
>>
>> 1) structures are represented by JSON structures (objects and arrays)
>> 2) values are represented by JSON values (string, number)
>> 3) types that are not numeric are represented by a widely supported string
>> representation.
>
> I agree with you for those types for which a widely-supported string
> representation exists.

There's no lack of different string representations for date and time, but we chose one which we believe to be widely-enough supported.
I don't know if that exists for the spatial types, and I'd like to find out.

>
>> If we invent our own structured representation, we might make things a
>> little easier for people who manually craft their application for the first
>> time, but we make it harder for people who are already working in the
>> domain and want to use AsterixDB to store their data.
>
> That's only true if there is a widely-supported string representation that
> "everyone" who is working in that domain will be prepared to handle. The
> only possible candidate we've seen is WKT, and it's highly unclear to me
> that we understand that format well enough to claim to be able to generate
> it correctly. Plus, circle.
>
> IMHO, if we want to offer WKT support, it makes sense to do that at a
> library level, not implied by serialization. We shouldn't assume that
> everyone who is using spatial types necessarily wants WKT. Think of it this
> way: By serializing to a basic JSON representation like my proposal, any
> downstream consumers can easily get the data, while those who specifically
> want WKT can generate WKT strings from within the query (possibly using a
> library we provide) and we'll serialize those directly. Both classes of
> user can be happy. The reverse is not true.
>
>
>> Also, if our support for spatial types differs significantly from the
>> "usual" support, we should consider if we are doing the right thing. I think
>> that we don't want to tell people dealing with spatial data how to do it.
>> I'd like to support them by providing the right infrastructure.
>
> This I completely agree with. But this is almost completely orthogonal to
> any discussion of how we serialize ADM. The only way it relates to the
> current discussion is that if we foresee some radical overhaul of ADM's
> spatial types in the near future, we should stop spending time worrying
> about how to serialize them.
>
>
>>> "location2d" : [41.0, 44.0],
>>> "location3d" : [44.0, 13.0, 41.0],
>>> "line" : [ [10.1, 11.1], [10.2, 11.2] ],
>>> "rectangle" : [ [5.1, 11.8], [87.6, 15.6548] ],
>>> "polygon" : [ [1.2, 1.3], [2.1, 2.5], [3.5, 3.6], [4.6, 4.8] ],
>>> "circle" : { "radius" : 10.1, "center" : [ 11.1, 10.2 ] },
>
>> The thing about this format is that it's really difficult to see (for
>> humans or parsers) what spatial types are represented by these nested
>> arrays.
>
>
> In most cases, these spatial types are going to be used as values in an
> object, and the corresponding JSON name will provide context. It'll be
> something like
>
> {
>   "tweet" : {
>     "userid" : "tillw",
>     "message" : "hello world",
>     "geolocation" : [44.0, -3.7]
>   }
> }
>
> For line, rectangle, polygon, and circle, I also suggested a more verbose
> format which names the components of the value; I'm happy with that as
> well. If that's not self-describing enough either, then I would suggest
> that we simply use the existing non-lossy JSON form for serializing spatial
> types. Indeed, that's how I'm moving forward with the implementation right
> now.

I think that makes a lot of sense. It seems that we (or at least I) don't really understand the space well enough to come up with a good alternative. And so - as you said - we probably should not spend too much time on implementing something.

Cheers,
Till
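
For illustration only, the two options weighed in this thread (serializing spatial values to plain nested JSON, and generating WKT separately at a library level) could be sketched roughly as below. This is a minimal Python sketch, not AsterixDB code: the function names `to_compact_json` and `to_wkt`, the kind labels, and the tuple-based input shapes are all hypothetical. The missing circle branch in `to_wkt` reflects the "Plus, circle" concern, since the basic WKT geometry types have no circle.

```python
import json

# Hypothetical helpers, not AsterixDB APIs. They only illustrate the two
# output shapes discussed in the thread.


def to_compact_json(kind, value):
    """Return the nested-array JSON form proposed in the thread."""
    if kind in ("point", "point3d"):
        return list(value)
    if kind in ("line", "rectangle", "polygon"):
        return [list(p) for p in value]
    if kind == "circle":
        center, radius = value
        return {"radius": radius, "center": list(center)}
    raise ValueError(f"unsupported spatial kind: {kind}")


def to_wkt(kind, value):
    """Build a WKT string for kinds with an obvious WKT counterpart."""
    if kind == "point":
        x, y = value
        return f"POINT ({x} {y})"
    if kind == "line":
        return "LINESTRING (" + ", ".join(f"{x} {y}" for x, y in value) + ")"
    if kind == "polygon":
        ring = list(value)
        if ring[0] != ring[-1]:
            ring.append(ring[0])  # WKT polygon rings must be closed
        return "POLYGON ((" + ", ".join(f"{x} {y}" for x, y in ring) + "))"
    # Deliberately no circle branch: there is no basic WKT circle geometry.
    raise ValueError(f"no WKT mapping in this sketch for: {kind}")


# The plain JSON form embeds directly in an enclosing object; a WKT value
# would simply be serialized as a string.
tweet = {"tweet": {"userid": "tillw",
                   "geolocation": to_compact_json("point", (44.0, -3.7))}}
print(json.dumps(tweet))
print(json.dumps({"geolocation": to_wkt("point", (44.0, -3.7))}))
```

Under these assumptions, a consumer that wants raw coordinates reads the arrays directly, while one that wants WKT calls the library function explicitly, which is the "both classes of user can be happy" argument made above.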