From: Chris Hillery
Date: Thu, 13 Aug 2015 14:59:37 -0700
Subject: Re: json vs. JSON
To: dev@asterixdb.incubator.apache.org

On Wed, Aug 12, 2015 at 10:26 PM, Till Westmann wrote:

> I really would like to get to a consistent set of rules on how we
> serialize ADM instances to JSON.
> My proposal for those rules is:
>
> 1) structures are represented by JSON structures (objects and arrays)
> 2) values are represented by JSON values (string, number)
> 3) types that are not numeric are represented by a widely supported string
> representation.

I agree with you for those types for which a widely-supported string
representation exists.

> If we invent our own structured representation, we might make things a
> little easier for people who manually craft their application for the first
> time, but we make it harder for people who are already working in the
> domain and want to use AsterixDB to store their data.

That's only true if there is a widely-supported string representation that
"everyone" who is working in that domain will be prepared to handle. The
only possible candidate we've seen is WKT, and it's highly unclear to me
that we understand that format well enough to claim to be able to generate
it correctly. Plus, circle.

IMHO, if we want to offer WKT support, it makes sense to do that at a
library level, not implied by serialization. We shouldn't assume that
everyone who is using spatial types necessarily wants WKT.

Think of it this way: By serializing to a basic JSON representation like my
proposal, any downstream consumers can easily get the data, while those who
specifically want WKT can generate WKT strings from within the query
(possibly using a library we provide) and we'll serialize those directly.
Both classes of user can be happy. The reverse is not true.

> Also, if our support for spatial types differs significantly from the
> "usual" support, we should consider if we are doing the right thing. I think
> that we don't want to tell people dealing with spatial data how to do it.
> I'd like to support them by providing the right infrastructure.

This I completely agree with. But this is almost completely orthogonal to
any discussion of how we serialize ADM.
The only way it relates to the current discussion is that if we foresee
some radical overhaul of ADM's spatial types in the near future, we should
stop spending time worrying about how to serialize them.

>> "location2d" : [41.0, 44.0],
>> "location3d" : [44.0, 13.0, 41.0],
>> "line" : [ [10.1, 11.1], [10.2, 11.2] ],
>> "rectangle" : [ [5.1, 11.8], [87.6, 15.6548] ],
>> "polygon" : [ [1.2, 1.3], [2.1, 2.5], [3.5, 3.6], [4.6, 4.8] ],
>> "circle" : { "radius" : 10.1, "center" : [ 11.1, 10.2 ] },
>
> The thing about this format is that it's really difficult to see (for
> humans or parsers) what spatial types are represented by these nested
> arrays.

In most cases, these spatial types are going to be used as values in an
object, and the corresponding JSON name will provide context. It'll be
something like

    {
      "tweet" : {
        "userid" : "tillw",
        "message" : "hello world",
        "geolocation" : [44.0, -3.7]
      }
    }

For line, rectangle, polygon, and circle, I also suggested a more verbose
format which names the components of the value; I'm happy with that as well.
If that's not self-describing enough either, then I would suggest that we
simply use the existing non-lossy JSON form for serializing spatial types.
Indeed, that's how I'm moving forward with the implementation right now.

Ceej
aka Chris Hillery
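
As an illustration of the "WKT at a library level" idea discussed in this
message, here is a minimal sketch in Python of turning the plain nested-array
JSON forms into WKT strings. Everything in it is an assumption made for
illustration: the function name `to_wkt`, the kind labels, the x-then-y
ordinate order, and the reading of a rectangle as two opposite corners are
hypothetical and not part of AsterixDB. Circle is deliberately left
unsupported, since this sketch knows of no direct WKT counterpart for it.

```python
def _coords(points):
    # WKT separates ordinates with spaces and points with commas.
    return ", ".join(" ".join(repr(float(c)) for c in p) for p in points)


def to_wkt(kind, value):
    """Convert a nested-array spatial value to a WKT string (hypothetical helper)."""
    if kind == "point":
        return f"POINT ({_coords([value])})"
    if kind == "line":
        return f"LINESTRING ({_coords(value)})"
    if kind == "polygon":
        ring = list(value)
        if ring[0] != ring[-1]:
            # WKT polygon rings are closed: repeat the first point at the end.
            ring.append(ring[0])
        return f"POLYGON (({_coords(ring)}))"
    if kind == "rectangle":
        # Assumption: the two points are opposite corners of the rectangle.
        (x1, y1), (x2, y2) = value
        return to_wkt("polygon", [[x1, y1], [x2, y1], [x2, y2], [x1, y2]])
    # Anything else (for example circle) has no mapping in this sketch.
    raise ValueError(f"no WKT form for {kind!r}")


if __name__ == "__main__":
    print(to_wkt("point", [41.0, 44.0]))
    print(to_wkt("line", [[10.1, 11.1], [10.2, 11.2]]))
```

A consumer that only wants the raw coordinates never needs to call such a
helper, while one that wants WKT pays for the conversion explicitly, which is
the asymmetry the message argues for.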