asterixdb-dev mailing list archives

From Till Westmann <ti...@apache.org>
Subject Re: json vs. JSON
Date Fri, 14 Aug 2015 00:13:58 GMT


> On Aug 13, 2015, at 14:59, Chris Hillery <chillery@hillery.land> wrote:
> 
>> On Wed, Aug 12, 2015 at 10:26 PM, Till Westmann <tillw@apache.org> wrote:
>> 
>> 
>> I really would like to get to a consistent set of rules on how we
>> serialize ADM instances to JSON.
>> My proposal for those rules is:
>> 
>> 1) structures are represented by JSON structures (objects and arrays)
>> 2) values are represented by JSON values (string, number)
>> 3) types that are not numeric are represented by a widely supported string
>> representation.
> 
> I agree with you for those types for which a widely-supported string
> representation exists.

There's no lack of different string representations for date and time, but we chose one which
we believe to be widely-enough supported. 
I don't know if that exists for the spatial types, and I'd like to find out. 
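
To make the three proposed rules concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `to_json_value` is hypothetical, Python dicts, lists, and `datetime` objects stand in for ADM instances, and ISO 8601 is used as the string form for date and time because it is a common choice (the thread does not say which representation was picked).

```python
import json
from datetime import date, datetime, time


def to_json_value(value):
    """Map a Python stand-in for an ADM value to a JSON-ready value.

    Hypothetical helper, not AsterixDB code.
    """
    # Rule 1: structures become JSON objects and arrays.
    if isinstance(value, dict):
        return {key: to_json_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_value(item) for item in value]
    # Rule 2: plain values become JSON values.
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    # Rule 3: non-numeric types become a string representation
    # (ISO 8601 is assumed here).
    if isinstance(value, (datetime, date, time)):
        return value.isoformat()
    raise TypeError(f"no JSON mapping sketched for {type(value).__name__}")


record = {"userid": "tillw", "sent": datetime(2015, 8, 14, 0, 13, 58), "tags": ("a", "b")}
print(json.dumps(to_json_value(record)))
```

Spatial types deliberately fall through to the `TypeError` branch, since whether a suitable string form exists for them is exactly the open question.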

> 
>> If we invent our own structured representation, we might make things a
>> little easier for people who manually craft their application for the first
>> time, but we make it harder for people who are already working in the
>> domain and want to use AsterixDB to store their data.
> 
> That's only true if there is a widely-supported string representation that
> "everyone" who is working in that domain will be prepared to handle. The
> only possible candidate we've seen is WKT, and it's highly unclear to me
> that we understand that format well enough to claim to be able to generate
> it correctly. Plus, circle.
> 
> IMHO, if we want to offer WKT support, it makes sense to do that at a
> library level, not implied by serialization. We shouldn't assume that
> everyone who is using spatial types necessarily wants WKT. Think of it this
> way: By serializing to a basic JSON representation like my proposal, any
> downstream consumers can easily get the data, while those who specifically
> want WKT can generate WKT strings from within the query (possibly using a
> library we provide) and we'll serialize those directly. Both classes of
> user can be happy. The reverse is not true.
> 
> 
>> Also, if our support for spatial types differs significantly from the
>> "usual" support, we should consider if we are doing the right thing. I think
>> that we don't want to tell people dealing with spatial data how to do it.
>> I'd like to support them by providing the right infrastructure.
> 
> This I completely agree with. But this is almost completely orthogonal to
> any discussion of how we serialize ADM. The only way it relates to the
> current discussion is that if we foresee some radical overhaul of ADM's
> spatial types in the near future, we should stop spending time worrying
> about how to serialize them.
> 
> 
>>> "location2d" : [41.0, 44.0],
>>> "location3d" : [44.0, 13.0, 41.0],
>>> "line" : [ [10.1, 11.1], [10.2, 11.2] ],
>>> "rectangle" : [ [5.1, 11.8], [87.6, 15.6548] ],
>>> "polygon" : [ [1.2, 1.3], [2.1, 2.5], [3.5, 3.6], [4.6, 4.8] ],
>>> "circle" : { "radius" : 10.1, "center" : [ 11.1, 10.2 ] },
> 
>> The thing about this format is that it's really difficult to see (for
>> humans or parsers) what spatial types are represented by these nested
>> arrays.
> 
> 
> In most cases, these spatial types are going to be used as values in an
> object, and the corresponding JSON name will provide context. It'll be
> something like
> 
>  {
>    "tweet" : {
>      "userid" : "tillw",
>      "message" : "hello world",
>      "geolocation" : [44.0, -3.7]
>    }
>  }
> 
> For line, rectangle, polygon, and circle, I also suggested a more verbose
> format which names the components of the value; I'm happy with that as
> well. If that's not self-describing enough either, then I would suggest
> that we simply use the existing non-lossy JSON form for serializing spatial
> types. Indeed, that's how I'm moving forward with the implementation right
> now.

I think that makes a lot of sense. It seems that we (or at least I) don't really understand
the space well enough to come up with a good alternative. And so - as you said - we probably
should not spend too much time on implementing something. 
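
For illustration, here is a minimal Python sketch of the "WKT at a library level" idea from the quoted text: a downstream consumer (or a library function) derives WKT strings from the plain nested-array form. The function names are hypothetical, each pair is assumed to be in x, y order (the thread does not specify the axis order), and circle is left out because, as noted above, it has no obvious WKT counterpart.

```python
def _coords(points):
    """Format [[x, y], ...] as the comma-separated 'x y' list used in WKT."""
    return ", ".join(f"{x} {y}" for x, y in points)


def point_to_wkt(point):
    """[x, y] -> 'POINT (x y)'."""
    x, y = point
    return f"POINT ({x} {y})"


def line_to_wkt(points):
    """[[x1, y1], [x2, y2], ...] -> 'LINESTRING (...)'."""
    return f"LINESTRING ({_coords(points)})"


def polygon_to_wkt(points):
    """List of vertices -> 'POLYGON ((...))' with a single outer ring."""
    ring = list(points)
    # WKT polygon rings are closed: the first vertex is repeated at the end.
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    return f"POLYGON (({_coords(ring)}))"


print(point_to_wkt([41.0, 44.0]))
print(line_to_wkt([[10.1, 11.1], [10.2, 11.2]]))
print(polygon_to_wkt([[1.2, 1.3], [2.1, 2.5], [3.5, 3.6], [4.6, 4.8]]))
```

Going in this direction is a few lines of string formatting, whereas recovering the numbers from a WKT string requires a parser on the consumer side.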

Cheers,
Till


