asterixdb-dev mailing list archives

From Chris Hillery <chill...@hillery.land>
Subject Re: json vs. JSON
Date Thu, 13 Aug 2015 21:59:37 GMT
On Wed, Aug 12, 2015 at 10:26 PM, Till Westmann <tillw@apache.org> wrote:

>
> I really would like to get to a consistent set of rules on how we
> serialize ADM instances to JSON.
> My proposal for those rules is:
>
> 1) structures are represented by JSON structures (objects and arrays)
> 2) values are represented by JSON values (string, number)
> 3) types that are not numeric are represented by a widely supported string
> representation.
>

I agree with you for those types for which a widely-supported string
representation exists.

> If we invent our own structured representation, we might make things a
> little easier for people who manually craft their application for the first
> time, but we make it harder for people who are already working in the
> domain and want to use AsterixDB to store their data.
>

That's only true if there is a widely-supported string representation that
"everyone" who is working in that domain will be prepared to handle. The
only possible candidate we've seen is WKT, and it's highly unclear to me
that we understand that format well enough to claim to be able to generate
it correctly. Plus, circle.

IMHO, if we want to offer WKT support, it makes sense to do that at a
library level, not implied by serialization. We shouldn't assume that
everyone who is using spatial types necessarily wants WKT. Think of it this
way: By serializing to a basic JSON representation like my proposal, any
downstream consumers can easily get the data, while those who specifically
want WKT can generate WKT strings from within the query (possibly using a
library we provide) and we'll serialize those directly. Both classes of
user can be happy. The reverse is not true.
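
To illustrate the library-level route, here is a rough sketch of how a
downstream consumer might build WKT strings from the array-based JSON
proposed in this thread. The function names are hypothetical and not part of
AsterixDB, and the sketch assumes points arrive as [x, y] or [x, y, z] arrays
and lines and polygons as lists of such arrays. Circle is left out because it
is the problem case for WKT noted above.

```python
import json


def _coords(point):
    """Join one coordinate array into WKT's space-separated form."""
    return " ".join(repr(float(c)) for c in point)


def point_to_wkt(point):
    """Format an [x, y] or [x, y, z] array as a WKT POINT."""
    tag = "POINT Z" if len(point) == 3 else "POINT"
    return f"{tag} ({_coords(point)})"


def line_to_wkt(points):
    """Format a list of points as a WKT LINESTRING."""
    return "LINESTRING (" + ", ".join(_coords(p) for p in points) + ")"


def polygon_to_wkt(points):
    """Format a list of points as a WKT POLYGON with a single outer ring.

    WKT rings repeat the first point at the end, so the ring is closed
    here if the input leaves it open.
    """
    ring = list(points)
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return "POLYGON ((" + ", ".join(_coords(p) for p in ring) + "))"


# Assumed input: a record serialized with the array-based proposal.
record = json.loads("""
{
  "location2d": [41.0, 44.0],
  "line": [[10.1, 11.1], [10.2, 11.2]],
  "polygon": [[1.2, 1.3], [2.1, 2.5], [3.5, 3.6], [4.6, 4.8]]
}
""")

print(point_to_wkt(record["location2d"]))
print(line_to_wkt(record["line"]))
print(polygon_to_wkt(record["polygon"]))
```

The same conversion could instead run inside the query, in which case the
resulting strings would simply be serialized as JSON strings.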


> Also, if our support for spatial types differs significantly from the
> "usual" support, we should consider if we are doing the right thing. I think
> that we don't want to tell people dealing with spatial data how to do it.
> I'd like to support them by providing the right infrastructure.
>

This I completely agree with. But this is almost completely orthogonal to
any discussion of how we serialize ADM. The only way it relates to the
current discussion is that if we foresee some radical overhaul of ADM's
spatial types in the near future, we should stop spending time worrying
about how to serialize them.


>> "location2d" : [41.0, 44.0],
>> "location3d" : [44.0, 13.0, 41.0],
>> "line" : [ [10.1, 11.1], [10.2, 11.2] ],
>> "rectangle" : [ [5.1, 11.8], [87.6, 15.6548] ],
>> "polygon" : [ [1.2, 1.3], [2.1, 2.5], [3.5, 3.6], [4.6, 4.8] ],
>> "circle" : { "radius" : 10.1, "center" : [ 11.1, 10.2 ] },
>>
>

> The thing about this format is that it's really difficult to see (for
> humans or parsers) what spatial types are represented by these nested
> arrays.


In most cases, these spatial types are going to be used as values in an
object, and the corresponding JSON name will provide context. It'll be
something like

  {
    "tweet" : {
      "userid" : "tillw",
      "message" : "hello world",
      "geolocation" : [44.0, -3.7]
    }
  }
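
A consumer that already knows the field name needs no type tag to read the
value. A minimal sketch, assuming the illustrative tweet document above; the
helper name is hypothetical, and the two numbers are treated as an opaque
coordinate pair since the axis order is not fixed in this thread:

```python
import json


def geolocation_of(doc_text):
    """Return the two-number geolocation pair from a tweet document."""
    doc = json.loads(doc_text)
    first, second = doc["tweet"]["geolocation"]  # field name gives context
    return first, second


tweet_json = """
{
  "tweet": {
    "userid": "tillw",
    "message": "hello world",
    "geolocation": [44.0, -3.7]
  }
}
"""

print(geolocation_of(tweet_json))
```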

For line, rectangle, polygon, and circle, I also suggested a more verbose
format which names the components of the value; I'm happy with that as
well. If that's not self-describing enough either, then I would suggest
that we simply use the existing non-lossy JSON form for serializing spatial
types. Indeed, that's how I'm moving forward with the implementation right
now.

Ceej
aka Chris Hillery
