Subject: Re: Metadata names generation
From: abdullah alamoudi
To: dev@asterixdb.incubator.apache.org
Date: Thu, 25 Jun 2015 01:19:23 +0300

I see. Well, as long as there is a systematic way to tell which is which,
it is not as bad as I thought.

Abdullah.

On Thu, Jun 25, 2015 at 1:16 AM, Steven Jacobs wrote:

> a clear case is where there is a data type with a field named "a.b" and
> another field named "a" which has a nested field named "b".
>
> This is allowed right now. You would have to access the first as "a.b" and
> the second as a.b. The quotes basically tell the parser "this is a single
> name with whatever characters I want in it."
> To me it seems fine to disallow some characters, but back when I had
> discussions about this with Vinayak, Mike, and Till, Till was arguing
> against disallowing characters. I can't really remember his reasons now
> though.
>
> @Till, what are your thoughts on this?
>
> Steven
>
> On Wed, Jun 24, 2015 at 11:56 AM, abdullah alamoudi wrote:
>
> > If that's the case, then I think we need to disallow using the "." since
> > it is used to access nested fields and can definitely cause ambiguity.
> >
> > A clear case is where there is a data type with a field named "a.b" and
> > another field named "a" which has a nested field named "b".
> >
> > Thoughts?
> >
> > On Wed, Jun 24, 2015 at 9:51 PM, Steven Jacobs wrote:
> >
> > > I think there is no completely user-friendly way around this.
> > > Basically our names allow ALL characters if they are encapsulated in
> > > quotes, so there isn't a character we can use that doesn't have the
> > > potential for ambiguity from the user's perspective. This is why I had
> > > to change the nested stuff in indexing to be a list of strings rather
> > > than a single string.
> > > Steven
> > >
> > > On Wed, Jun 24, 2015 at 11:43 AM, Chen Li wrote:
> > >
> > > > In this case, there could be ambiguity in the names. Does it matter?
> > > >
> > > > Chen
> > > >
> > > > On Wed, Jun 24, 2015 at 11:17 AM, Steven Jacobs wrote:
> > > >
> > > > > Fieldnames do allow these characters (both of them).
> > > > > Steven
> > > > >
> > > > > On Wed, Jun 24, 2015 at 11:15 AM, Chen Li wrote:
> > > > >
> > > > > > I also prefer "." to "_". Also want to confirm that field names
> > > > > > don't allow these two characters.
> > > > > >
> > > > > > Chen
> > > > > >
> > > > > > On Wed, Jun 24, 2015 at 10:52 AM, Steven Jacobs <sjaco002@ucr.edu> wrote:
> > > > > >
> > > > > > > I second Young-Seok (especially since this is the syntax that
> > > > > > > users will use themselves for nested information in queries).
> > > > > > >
> > > > > > > Steven
> > > > > > >
> > > > > > > On Wed, Jun 24, 2015 at 10:40 AM, Young-Seok Kim <kisskys@gmail.com> wrote:
> > > > > > >
> > > > > > > > It seems better to use "." instead of "_" since "." is more
> > > > > > > > intuitive (at least to me) than "_".
> > > > > > > > For example, the FacebookUserType_address will be
> > > > > > > > FacebookUserType.address.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Young-Seok
> > > > > > > >
> > > > > > > > On Wed, Jun 24, 2015 at 6:31 AM, Mike Carey <dtabass@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Much cleaner! Others should weigh in here to help finalize the
> > > > > > > > > conventions.... Thoughts?
> > > > > > > > > On Jun 23, 2015 5:31 PM, "Ildar Absalyamov" <iabsa001@cs.ucr.edu> wrote:
> > > > > > > > >
> > > > > > > > > > So the general solution is that the generated names should
> > > > > > > > > > become less verbose (consider previous examples):
> > > > > > > > > > 1) Anonymous fields naming scheme will change to
> > > > > > > > > > outerTypeName + “_” + fieldName, i.e.
> > > > > > > > > > “Field_address_in_FacebookUserType” is changed to
> > > > > > > > > > “FacebookUserType_address”
> > > > > > > > > > 2) Anonymous collection item naming scheme stays the same, i.e.
> > > > > > > > > > “Field_employment_in_FacebookUserType_ItemType” is changed to
> > > > > > > > > > “FacebookUserType_employment_ItemType” (name is changed
> > > > > > > > > > because the anonymous field employment naming was changed as
> > > > > > > > > > described earlier)
> > > > > > > > > > 3) Union type completely ceases to exist in metadata (it
> > > > > > > > > > stays in the object model though), i.e.
> > > > > > > > > > “Type_#1_UnionType_Field_end-date_in_Field_employment_in_FacebookUserType_ItemType”
> > > > > > > > > > is changed to “FacebookUserType_employment_ItemType_end-date”,
> > > > > > > > > > where the type metadata will have an additional field
> > > > > > > > > > “Optional” with value “true”.
> > > > > > > > > >
> > > > > > > > > > > On Jun 19, 2015, at 18:11, Ildar Absalyamov <iabsa001@cs.ucr.edu> wrote:
> > > > > > > > > > >
> > > > > > > > > > > So I have done half of the fix, which moved the name
> > > > > > > > > > > generation logic out of the Metadata node to the client.
> > > > > > > > > > > Up to that point nothing in Metadata format was changed,
> > > > > > > > > > > which makes me wonder whether I should proceed with the
> > > > > > > > > > > following changes.
> > > > > > > > > > >
> > > > > > > > > > > As can be seen from the previous email, getting rid of
> > > > > > > > > > > union-inferred name generation would make auto generated
> > > > > > > > > > > type names less scary, but not entirely.
> > > > > > > > > > > Having in mind what Mike mentioned earlier today, should we
> > > > > > > > > > > do something about other auto generated type name cases?
> > > > > > > > > > >
> > > > > > > > > > >> On Jun 19, 2015, at 13:01, Ildar Absalyamov <iabsa001@cs.ucr.edu> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> Currently we are generating the names for inner\anonymous
> > > > > > > > > > >> types in the following cases:
> > > > > > > > > > >> 1) Anonymous field in the record.
> > > > > > > > > > >> AQL Example:
> > > > > > > > > > >> create type FacebookUserType as closed {
> > > > > > > > > > >>   id: int32,
> > > > > > > > > > >>   name: string,
> > > > > > > > > > >>   address: {
> > > > > > > > > > >>     address_line: string,
> > > > > > > > > > >>     city: string,
> > > > > > > > > > >>     state: string
> > > > > > > > > > >>   }
> > > > > > > > > > >> }
> > > > > > > > > > >> The pattern for generating an anonymous field name is
> > > > > > > > > > >> "Field_" + fieldName + "_in_" + outerTypeName, which
> > > > > > > > > > >> translates to "Field_address_in_FacebookUserType" in the
> > > > > > > > > > >> given example
> > > > > > > > > > >>
> > > > > > > > > > >> 2) Anonymous collection (ordered\unordered list) item
> > > > > > > > > > >> create type FacebookUserType as closed {
> > > > > > > > > > >>   id: int32,
> > > > > > > > > > >>   name: string,
> > > > > > > > > > >>   employment: [{
> > > > > > > > > > >>     organization-name: string,
> > > > > > > > > > >>     start-date: date,
> > > > > > > > > > >>     end-date: date?
> > > > > > > > > > >>   }]
> > > > > > > > > > >> }
> > > > > > > > > > >> The pattern for generating an anonymous collection item
> > > > > > > > > > >> name is collectionFieldName + "_ItemType", which translates
> > > > > > > > > > >> to "Field_employment_in_FacebookUserType_ItemType" in the
> > > > > > > > > > >> given example
> > > > > > > > > > >>
> > > > > > > > > > >> 3) Nullable fields
> > > > > > > > > > >> Same example as above could be used (end-date field): the
> > > > > > > > > > >> pattern for generating a nullable field name is "Type_#" +
> > > > > > > > > > >> fieldsNumberInUnoinList + "_UnionType_" + outerTypeName,
> > > > > > > > > > >> which translates to
> > > > > > > > > > >> "Type_#1_UnionType_Field_end-date_in_Field_employment_in_FacebookUserType_ItemType"
> > > > > > > > > > >> in the given example.
> > > > > > > > > > >>
> > > > > > > > > > >> So you can see these auto generated names could stack up
> > > > > > > > > > >> pretty fast and be completely incomprehensible.
> > > > > > > > > > >> Just to give you a small flavor of that, here is one of the
> > > > > > > > > > >> metadata datasets type definitions:
> > > > > > > > > > >>
> > > > > > > > > > >> open {
> > > > > > > > > > >>   DataverseName: STRING,
> > > > > > > > > > >>   DatatypeName: STRING,
> > > > > > > > > > >>   Derived: UNION(NULL, open {
> > > > > > > > > > >>     Tag: STRING,
> > > > > > > > > > >>     IsAnonymous: BOOLEAN,
> > > > > > > > > > >>     EnumValues: UNION(NULL, [ STRING ]),
> > > > > > > > > > >>     Record: UNION(NULL, open {
> > > > > > > > > > >>       IsOpen: BOOLEAN,
> > > > > > > > > > >>       Fields: [ open {
> > > > > > > > > > >>         FieldName: STRING,
> > > > > > > > > > >>         FieldType: STRING
> > > > > > > > > > >>       } ]
> > > > > > > > > > >>     }),
> > > > > > > > > > >>     Union: UNION(NULL, [ STRING ]),
> > > > > > > > > > >>     UnorderedList: UNION(NULL, STRING),
> > > > > > > > > > >>     OrderedList: UNION(NULL, STRING)
> > > > > > > > > > >>   }),
> > > > > > > > > > >>   Timestamp: STRING
> > > > > > > > > > >> }
> > > > > > > > > > >>
> > > > > > > > > > >> And here are a couple of field names generated for it :)
> > > > > > > > > > >>
> > > > > > > > > > >> Type_#1_UnionType_Field_Record_in_Type_#1_UnionType_Field_Derived_in_DatatypeRecordType
> > > > > > > > > > >>
> > > > > > > > > > >> Field_UnorderedList_in_Type_#1_UnionType_Field_Derived_in_DatatypeRecordType
> > > > > > > > > > >>
> > > > > > > > > > >> Field_Fields_in_Type_#1_UnionType_Field_Record_in_Type_#1_UnionType_Field_Derived_in_DatatypeRecordType_ItemType
> > > > > > > > > > >>
> > > > > > > > > > >> Best regards,
> > > > > > > > > > >> Ildar
> > > > > > > > > > >
> > > > > > > > > > > Best regards,
> > > > > > > > > > > Ildar
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > > Ildar
> >
> > --
> > Amoudi, Abdullah.

--
Amoudi, Abdullah.
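
The naming patterns quoted in this thread can be sketched in a few lines of Python, which may make the old and proposed schemes easier to compare. This is only an illustration: the helper names (old_field_type, item_type, old_union_type, new_field_type) are hypothetical and are not AsterixDB code. Only the string patterns and the FacebookUserType example come from the messages above. The last lines show the "a.b" ambiguity and why a list of strings avoids it.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of AsterixDB. They mirror the string patterns quoted in the thread.

def old_field_type(outer_type, field):
    # Old pattern: "Field_" + fieldName + "_in_" + outerTypeName
    return "Field_" + field + "_in_" + outer_type

def item_type(collection_type):
    # Collection item pattern (unchanged by the proposal): name + "_ItemType"
    return collection_type + "_ItemType"

def old_union_type(position, outer_type):
    # Old nullable-field pattern: "Type_#" + n + "_UnionType_" + outerTypeName
    return "Type_#" + str(position) + "_UnionType_" + outer_type

def new_field_type(outer_type, field, sep="_"):
    # Proposed pattern: outerTypeName + separator + fieldName.
    # The thread debates "_" versus "." as the separator.
    return outer_type + sep + field

# Old scheme for the nullable end-date field inside the employment list.
old_employment_item = item_type(old_field_type("FacebookUserType", "employment"))
old_end_date = old_union_type(1, old_field_type(old_employment_item, "end-date"))

# Proposed scheme: no union type name; optionality would be a metadata flag.
new_employment_item = item_type(new_field_type("FacebookUserType", "employment"))
new_end_date = new_field_type(new_employment_item, "end-date")

# Ambiguity from the thread: a quoted field named "a.b" versus a field "a"
# with a nested field "b". As lists of strings the two paths stay distinct,
# but joining them with "." collapses them into the same text.
flat_path = ["a.b"]
nested_path = ["a", "b"]
same_when_joined = ".".join(flat_path) == ".".join(nested_path)
```

Keeping a path as a list of strings, as Steven describes for nested fields in indexing, preserves the distinction that a single dotted string loses, whichever separator is chosen for generated names.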