From: Ahmed Eldawy
Date: Wed, 21 Jun 2017 14:07:26 -0700
Subject: Re: Parse GeoJSON data into a record in AsterixDB
To: dev@asterixdb.apache.org
Cc: users@asterixdb.apache.org
References: <306fb3cb-06bb-2ec2-9ba1-e669a45390bd@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="94eb2c0340741dd2af05527ec365"

--94eb2c0340741dd2af05527ec365
Content-Type: text/plain; charset="UTF-8"

Hi Riyafa,

I think you should use the terms *feature* and *geometry* to avoid
confusion, as in the following example.

CREATE TYPE AnyObject AS {};

CREATE TYPE FeatureType AS {
    `type`: string,
    the_geom: geometry,
    properties: AnyObject
};

The inner 'geometry' attribute holds the shape itself, while the outer
FeatureType associates additional attributes and properties with that
geometry.

As we discussed in our last call, the first step is to parse GeoJSON as a
regular JSON file and then parse the geometry using a UDF. In this case,
you might have something like:

CREATE TYPE FeatureType AS {
    `type`: string,
    the_geom: AnyObject,
    properties: AnyObject
};

This will allow you to parse the file without modifying the existing JSON
parser.
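
To make the two-step idea concrete, here is a rough sketch in Python
rather than AsterixDB code. It is only an illustration under a few
assumptions: parse_geometry is a hypothetical stand-in for the UDF
discussed in this thread, WKT is just one convenient output format, and
only Point and LineString are handled. The sample record is the one from
the original question, so the geometry sits under the "geometry" key.

```python
import json


def parse_geometry(geom):
    """Hypothetical stand-in for the UDF: convert a GeoJSON geometry
    object (already parsed into a plain dict) into a WKT string."""
    kind = geom["type"]
    coords = geom["coordinates"]
    if kind == "Point":
        x, y = coords[:2]
        return f"POINT ({x} {y})"
    if kind == "LineString":
        # Ignore any optional third (elevation) coordinate.
        return "LINESTRING (" + ", ".join(f"{x} {y}" for x, y, *_ in coords) + ")"
    raise ValueError(f"unsupported geometry type: {kind}")


# Step 1: parse the feature as regular JSON. The geometry is left as an
# open, untyped object, which is the role AnyObject plays in the type above.
feature = json.loads("""
{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [-118.40, 33.93]},
  "properties": {"code": "LAX", "elevation": 38}
}
""")

# Step 2: convert only the geometry object in a separate pass.
wkt = parse_geometry(feature["geometry"])
print(wkt)  # prints: POINT (-118.4 33.93)
```

Keeping the second step separate is what lets the existing JSON parser
stay untouched; only the conversion function has to know about geometry
types.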

Then, you can define a UDF called "ParseGeoJSON" which takes the
"the_geom" attribute as input and returns a parsed geometry attribute.
The next iteration should make this extra step unnecessary by
automatically detecting and parsing the geometry attribute directly from
GeoJSON.

Thanks,
Ahmed

On Wed, Jun 21, 2017 at 11:40 AM, Yingyi Bu wrote:

> >> type appears to be keyword
>
> `type` would make it valid.
>
> >> We can't use the defining type within the same type recursively
> >> (ie. GeometryType within GeometryType)
>
> We don't support recursive type definition.
>
> >> The type object cannot be resolved
>
> We don't have a builtin name for a completely open type, but you can
> define one.
>
> What you can do is:
>
> CREATE TYPE AnyObject AS {};
>
> CREATE TYPE GeometryType AS {
>     `type`: string,
>     geometry: SomeType,
>     properties: AnyObject
> };
>
> Best,
> Yingyi
>
> On Wed, Jun 21, 2017 at 6:33 AM, Mike Carey wrote:
>
> > One approach would be to be silent about properties - and then it
> > could be there anyway - however, that wouldn't allow you to state the
> > requirement (?) that it must be called properties and/or that it must
> > be an object (not a scalar). That could work for now, perhaps? We
> > need to have an "any record type" type name - we've noted a desire
> > for that - unfortunately we don't have one I don't think. I believe
> > the concept is there inside the code, in the type-related areas, but
> > we don't have a keyword like name for it. (@Yingyi - comments?) And
> > we do also have a restriction (at the type level) that precludes
> > recursion (regular or mutual) in type definitions; we probably need
> > to do something about that someday as well.
> >
> > In the meantime, these things could be handled (weakly) by
> > documenting what's expected/allowed in this setting.
> >
> > Cheers,
> >
> > Mike
> >
> > PS - I wonder if JSON Schema has the expressiveness for this?
> >
> > On 6/21/17 2:26 AM, Riyafa Abdul Hameed wrote:
> >
> > Hi,
> >
> > I would like to parse the following or any GeoJSON type[1] to a record
> > in AsterixDB:
> > {
> >    "type":"Feature",
> >    "geometry":{
> >       "type":"Point",
> >       "coordinates":[
> >          -118.40,
> >          33.93
> >       ]
> >    },
> >    "properties":{
> >       "code":"LAX",
> >       "elevation":38
> >    }
> > }
> >
> > The value of properties is optional and variable; that is, it can be
> > any type of object. What is the most suitable datatype to use to
> > represent properties in this case?
> > Is something like the following possible?
> >
> > CREATE TYPE GeometryType AS {
> >      type: string,
> >      geometry: GeometryType,
> >      properties: object
> > };
> >
> > I came up with the above because there's a derived type called
> > objects[2] in AsterixDB. The above doesn't work because of the
> > following reasons:
> >
> >    - type appears to be keyword
> >    - We can't use the defining type within the same type recursively
> >    (ie. GeometryType within GeometryType)
> >    - The type object cannot be resolved
> >
> > Any suggestions on how a GeoJSON can be parsed into AsterixDB?
> >
> > [1] https://tools.ietf.org/html/rfc7946
> > [2] https://ci.apache.org/projects/asterixdb/datamodel.html#DerivedTypesObject
> >
> > Thank you
> > Yours sincerely,
> > Riyafa
> >

--94eb2c0340741dd2af05527ec365
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--94eb2c0340741dd2af05527ec365--