asterixdb-dev mailing list archives

From Till Westmann <ti...@apache.org>
Subject Re: Metadata names generation
Date Thu, 25 Jun 2015 01:27:01 GMT

> On Jun 24, 2015, at 3:16 PM, Steven Jacobs <sjaco002@ucr.edu> wrote:
> 
> a clear case is where there is a data type with a field named "a.b" and
> another field named "a" which has a nested field named "b".
> 
> This is allowed right now. You would have to access the first as "a.b" and
> the second as a.b. The quotes basically tell the parser "this is a single
> name with whatever characters I want in it."

a.b is mainly a convenient shortcut for "a"."b".
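
To make the distinction concrete, here is a minimal sketch in Python (not the AsterixDB parser; `parse_field_path` and `lookup` are hypothetical names used only for illustration). It assumes that an unquoted "." separates nested names and that a double-quoted segment is taken verbatim. Escapes inside quotes are not handled.

```python
def parse_field_path(path):
    """Split a field path into a list of names.

    Assumption for this sketch: '.' separates names unless it appears
    inside double quotes, in which case it is part of the name.
    """
    names, current, in_quotes = [], [], False
    for ch in path:
        if ch == '"':
            in_quotes = not in_quotes      # enter or leave a quoted name
        elif ch == '.' and not in_quotes:
            names.append(''.join(current))  # unquoted dot ends a name
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quote in path: " + path)
    names.append(''.join(current))
    return names


def lookup(record, path):
    """Follow the parsed list of names through nested dicts."""
    value = record
    for name in parse_field_path(path):
        value = value[name]
    return value


# One field literally named "a.b", and a field "a" with a nested field "b".
record = {"a.b": 1, "a": {"b": 2}}
print(lookup(record, '"a.b"'))    # the single field named a.b
print(lookup(record, 'a.b'))      # nested access
print(lookup(record, '"a"."b"'))  # the long form of the same nested access
```

Once the path is a list of names rather than one string, the two cases can no longer be confused, which is the same reason the indexing code moved to a list of strings.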

> To me it seems fine to
> disallow some characters, but back when I had discussions about this with
> Vinayak, Mike, and Till, Till was arguing against disallowing characters. I
> can't really remember his reasons now though.
> 
> @Till, what are your thoughts on this?

All characters are allowed for field names in JSON (http://json.org/).
So if we disallow some characters, we will need to map names that contain them to something else
(or not allow such JSON documents).
It seems that that will get messy and/or painful pretty quickly.
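
A small Python sketch of why such a mapping gets messy. The renaming rule here ("." becomes "_") and the helper `rename_dots` are hypothetical, chosen only to illustrate the problem; nothing like this exists in AsterixDB as far as this thread says.

```python
import json

# A valid JSON document: "a.b", "a_b" and nested a -> b are all distinct.
doc = json.loads('{"a.b": 1, "a_b": 2, "a": {"b": 3}}')


def rename_dots(record, replacement="_"):
    """Hypothetical mapping: replace '.' in top-level field names."""
    renamed = {}
    for name, value in record.items():
        new_name = name.replace(".", replacement)
        if new_name in renamed:
            # Two different source names now map to the same stored name.
            raise ValueError("name collision on " + repr(new_name))
        renamed[new_name] = value
    return renamed


try:
    rename_dots(doc)
    collided = False
except ValueError as err:
    collided = True
    print(err)
```

Any fixed replacement character has the same problem, because the replacement is itself a legal character in a JSON field name. Avoiding collisions needs an escaping scheme, and that scheme then has to be undone on every read.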

Cheers,
Till

> On Wed, Jun 24, 2015 at 11:56 AM, abdullah alamoudi <bamousaa@gmail.com>
> wrote:
> 
>> If that's the case, then I think we need to disallow using the "." since it
>> is used to access nested fields and can definitely cause ambiguity.
>> 
>> a clear case is where there is a data type with a field named "a.b" and
>> another field named "a" which has a nested field named "b".
>> 
>> Thoughts?
>> 
>> 
>> On Wed, Jun 24, 2015 at 9:51 PM, Steven Jacobs <sjaco002@ucr.edu> wrote:
>> 
>>> I think there is no completely user-friendly way around this. Basically our
>>> names allow ALL characters if they are encapsulated in quotes, so there
>>> isn't a character we can use that doesn't have the potential for ambiguity
>>> from the user's perspective. This is why I had to change the nested stuff
>>> in indexing to be a list of strings rather than a single string.
>>> Steven
>>> 
>>> On Wed, Jun 24, 2015 at 11:43 AM, Chen Li <chenli@gmail.com> wrote:
>>> 
>>>> In this case, there could be ambiguity in the names.  Does it matter?
>>>> 
>>>> Chen
>>>> 
>>>> On Wed, Jun 24, 2015 at 11:17 AM, Steven Jacobs <sjaco002@ucr.edu> wrote:
>>>> 
>>>>> Fieldnames do allow these characters (both of them).
>>>>> Steven
>>>>> 
>>>>> On Wed, Jun 24, 2015 at 11:15 AM, Chen Li <chenli@gmail.com> wrote:
>>>>> 
>>>>>> I also prefer "." to "_".  Also want to confirm that field names don't
>>>>>> allow these two characters.
>>>>>> 
>>>>>> Chen
>>>>>> 
>>>>>> On Wed, Jun 24, 2015 at 10:52 AM, Steven Jacobs <sjaco002@ucr.edu> wrote:
>>>>>> 
>>>>>>> I second Young-Seok (especially since this is the syntax that users will
>>>>>>> use themselves for nested information in queries).
>>>>>>> 
>>>>>>> Steven
>>>>>>> 
>>>>>>> On Wed, Jun 24, 2015 at 10:40 AM, Young-Seok Kim <kisskys@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> It seems better to use "." instead of "_" since "." is more intuitive (at
>>>>>>>> least to me) than "_".
>>>>>>>> For example, the FacebookUserType_address will be FacebookUserType.address.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Young-Seok
>>>>>>>> 
>>>>>>>> On Wed, Jun 24, 2015 at 6:31 AM, Mike Carey <dtabass@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Much cleaner!  Others should weigh in here to help finalize the
>>>>>>>>> conventions....  Thoughts?
>>>>>>>>> On Jun 23, 2015 5:31 PM, "Ildar Absalyamov" <iabsa001@cs.ucr.edu> wrote:
>>>>>>>>> 
>>>>>>>>>> So the general solution is that the generated names should become less
>>>>>>>>>> verbose (consider previous examples):
>>>>>>>>>> 1) Anonymous fields naming scheme will change to outerTypeName + “_” +
>>>>>>>>>> fieldName, i.e. “Field_address_in_FacebookUserType” is changed to
>>>>>>>>>> “FacebookUserType_address”
>>>>>>>>>> 2) Anonymous collection item naming scheme stays the same, i.e.
>>>>>>>>>> “Field_employment_in_FacebookUserType_ItemType” is changed to
>>>>>>>>>> “FacebookUserType_employment_ItemType” (name is changed because the
>>>>>>>>>> anonymous field employment naming was changed as described earlier)
>>>>>>>>>> 3) Union type completely ceases to exist in metadata (it stays in the
>>>>>>>>>> object model though), i.e.
>>>>>>>>>> “Type_#1_UnionType_Field_end-date_in_Field_employment_in_FacebookUserType_ItemType”
>>>>>>>>>> is changed to “FacebookUserType_employment_ItemType_end-date”, where the
>>>>>>>>>> type metadata will have an additional field “Optional” with value “true”.
>>>>>>>>>> 
>>>>>>>>>>> On Jun 19, 2015, at 18:11, Ildar Absalyamov <iabsa001@cs.ucr.edu> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> So I have done half of the fix, which is moved name generation logic out
>>>>>>>>>>> of the Metadata node to the client.
>>>>>>>>>>> Up to that point nothing in Metadata format was changed, which makes me
>>>>>>>>>>> wonder whether I should proceed with the following changes.
>>>>>>>>>>> 
>>>>>>>>>>> As it could be seen from the previous email getting rid of
>>>>>>>>>>> union-inferred name generation would make auto generated type names less
>>>>>>>>>>> scary, but not entirely.
>>>>>>>>>>> Having in mind what Mike mentioned earlier today, should we do something
>>>>>>>>>>> about other auto generated type name cases?
>>>>>>>>>>> 
>>>>>>>>>>>> On Jun 19, 2015, at 13:01, Ildar Absalyamov <iabsa001@cs.ucr.edu> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Currently we are generating the names for inner\anonymous types in the
>>>>>>>>>>>> following cases:
>>>>>>>>>>>> 1) Anonymous field in the record.
>>>>>>>>>>>> AQL Example:
>>>>>>>>>>>> create type FacebookUserType as closed {
>>>>>>>>>>>>        id: int32,
>>>>>>>>>>>>        name: string,
>>>>>>>>>>>>        address: {
>>>>>>>>>>>>             address_line: string,
>>>>>>>>>>>>             city: string,
>>>>>>>>>>>>             state: string
>>>>>>>>>>>>        }
>>>>>>>>>>>> }
>>>>>>>>>>>> The pattern for generating an anonymous field name is "Field_" +
>>>>>>>>>>>> fieldName + "_in_" + outerTypeName, which translates to
>>>>>>>>>>>> "Field_address_in_FacebookUserType" in the given example
>>>>>>>>>>>> 
>>>>>>>>>>>> 2) Anonymous collection (ordered\unordered list) item
>>>>>>>>>>>> create type FacebookUserType as closed {
>>>>>>>>>>>>        id: int32,
>>>>>>>>>>>>        name: string,
>>>>>>>>>>>>        employment: [{
>>>>>>>>>>>>             organization-name: string,
>>>>>>>>>>>>             start-date: date,
>>>>>>>>>>>>             end-date: date?
>>>>>>>>>>>>        }]
>>>>>>>>>>>> }
>>>>>>>>>>>> The pattern for generating an anonymous collection item name is
>>>>>>>>>>>> collectionFieldName + "_ItemType", which translates to
>>>>>>>>>>>> "Field_employment_in_FacebookUserType_ItemType" in the given example
>>>>>>>>>>>> 
>>>>>>>>>>>> 3) Nullable fields
>>>>>>>>>>>> Same example as above could be used (end-date field): the pattern for
>>>>>>>>>>>> generating a nullable field name is "Type_#" + fieldsNumberInUnoinList +
>>>>>>>>>>>> "_UnionType_" + outerTypeName, which translates to
>>>>>>>>>>>> "Type_#1_UnionType_Field_end-date_in_Field_employment_in_FacebookUserType_ItemType"
>>>>>>>>>>>> in the given example.
>>>>>>>>>>>> 
>>>>>>>>>>>> So you can see these auto generated names could stack up pretty fast
>>>>>>>>>>>> and be completely incomprehensible. Just to give you a small flavor of
>>>>>>>>>>>> that, here is one of the metadata datasets type definitions:
>>>>>>>>>>>> 
>>>>>>>>>>>> open {
>>>>>>>>>>>>  DataverseName: STRING,
>>>>>>>>>>>>  DatatypeName: STRING,
>>>>>>>>>>>>  Derived: UNION(NULL, open {
>>>>>>>>>>>>      Tag: STRING,
>>>>>>>>>>>>      IsAnonymous: BOOLEAN,
>>>>>>>>>>>>      EnumValues: UNION(NULL, [ STRING ]),
>>>>>>>>>>>>      Record: UNION(NULL, open {
>>>>>>>>>>>>          IsOpen: BOOLEAN,
>>>>>>>>>>>>          Fields: [ open {
>>>>>>>>>>>>              FieldName: STRING,
>>>>>>>>>>>>              FieldType: STRING
>>>>>>>>>>>>            }
>>>>>>>>>>>>          ]
>>>>>>>>>>>>        }
>>>>>>>>>>>>      ),
>>>>>>>>>>>>      Union: UNION(NULL, [ STRING ]),
>>>>>>>>>>>>      UnorderedList: UNION(NULL, STRING),
>>>>>>>>>>>>      OrderedList: UNION(NULL, STRING)
>>>>>>>>>>>>    }
>>>>>>>>>>>>  ),
>>>>>>>>>>>>  Timestamp: STRING
>>>>>>>>>>>> }
>>>>>>>>>>>> 
>>>>>>>>>>>> And here are a couple of field names generated for it :)
>>>>>>>>>>>> 
>>>>>>>>>>>> Type_#1_UnionType_Field_Record_in_Type_#1_UnionType_Field_Derived_in_DatatypeRecordType
>>>>>>>>>>>> 
>>>>>>>>>>>> Field_UnorderedList_in_Type_#1_UnionType_Field_Derived_in_DatatypeRecordType
>>>>>>>>>>>> 
>>>>>>>>>>>> Field_Fields_in_Type_#1_UnionType_Field_Record_in_Type_#1_UnionType_Field_Derived_in_DatatypeRecordType_ItemType
>>>>>>>>>>>> 
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Ildar
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Ildar
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Best regards,
>>>>>>>>>> Ildar
>> 
>> --
>> Amoudi, Abdullah.
>> 
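
The naming patterns quoted in this thread can be sketched in Python as follows. This is an illustration of the string patterns only, not the AsterixDB implementation, and all function names are hypothetical. The `sep` parameter is an assumption added to show the "." variant suggested for field names. The thread does not say how "_ItemType" would look under that variant, so it is left unchanged here.

```python
def old_field_name(outer_type, field):
    # "Field_" + fieldName + "_in_" + outerTypeName
    return "Field_" + field + "_in_" + outer_type


def old_item_name(collection_type):
    # collection type name + "_ItemType"
    return collection_type + "_ItemType"


def old_union_name(index, inner_name):
    # "Type_#" + position in the union list + "_UnionType_" + inner name
    return "Type_#" + str(index) + "_UnionType_" + inner_name


def new_field_name(outer_type, field, sep="_"):
    # Proposed: outerTypeName + "_" + fieldName (or "." as suggested later)
    return outer_type + sep + field


def new_item_name(collection_type):
    # Item naming keeps the "_ItemType" suffix under the proposal
    return collection_type + "_ItemType"


# Old scheme, nullable end-date inside the employment list
old_item = old_item_name(old_field_name("FacebookUserType", "employment"))
old_end_date = old_union_name(1, old_field_name(old_item, "end-date"))

# Proposed scheme: no union name; optionality moves to a metadata flag
new_item = new_item_name(new_field_name("FacebookUserType", "employment"))
new_end_date = new_field_name(new_item, "end-date")

print(old_end_date)
print(new_end_date)
print(new_field_name("FacebookUserType", "address", sep="."))
```

The old names grow with every level of nesting because each level adds both a prefix and a suffix, while the proposed names only append one segment per level.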

