asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Nested type + open-enforced-index question.
Date Fri, 14 Jul 2017 06:38:19 GMT
Note that indexes can ONLY be associated with datasets - so it would 
seem (w/o looking at the code :-)) that maybe the required info could be 
hung at the top level in (path, type) form as an extension of the 
dataset's top-level record type.  E.g., given something like

     CREATE DATASET ChirpMessages(ChirpMessageType) PRIMARY KEY chirpId;

     CREATE INDEX ScreenNameIndex ON ChirpMessages(user.screenName: string?) TYPE BTREE ENFORCED;

then you would return a dataset type descriptor that says that 
ChirpMessages has all the info of type ChirpMessageType augmented with 
an additional typed path user.screenName of data type string.
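
A minimal sketch of that (path, type) idea in Python, purely for illustration. The class and method names are hypothetical and are not AsterixDB code, and the declared type of chirpId is an assumption, since ChirpMessageType's definition isn't shown here:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

FieldPath = Tuple[str, ...]  # e.g. ("user", "screenName")


@dataclass
class DatasetTypeDescriptor:
    """Hypothetical descriptor: declared record type plus typed paths."""
    dataset: str
    record_type: str
    # Top-level fields declared by the record type (field name -> type name).
    declared_fields: Dict[str, str]
    # Extra (path -> type) entries contributed by ENFORCED indexes.
    enforced_paths: Dict[FieldPath, str] = field(default_factory=dict)

    def add_enforced_index(self, path: str, type_name: str) -> None:
        """Record the type that an enforced index imposes on a field path."""
        key = tuple(path.split("."))
        existing = self.enforced_paths.get(key)
        if existing is not None and existing != type_name:
            raise ValueError(f"{path} is already enforced as {existing}")
        self.enforced_paths[key] = type_name

    def type_of(self, path: str) -> Optional[str]:
        """Look up a path: enforced paths first, then declared top-level fields."""
        key = tuple(path.split("."))
        if key in self.enforced_paths:
            return self.enforced_paths[key]
        if len(key) == 1:
            return self.declared_fields.get(key[0])
        return None  # open field with no enforced type


# Assumed: chirpId is declared as a string in ChirpMessageType.
chirps = DatasetTypeDescriptor("ChirpMessages", "ChirpMessageType",
                               {"chirpId": "string"})
chirps.add_enforced_index("user.screenName", "string?")
print(chirps.type_of("user.screenName"))
```

Keying on the whole path means no intermediate "user" record type has to exist for the lookup to work.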

Just a thought (probably half-baked :-))...

Cheers,

Mike


On 7/13/17 11:10 PM, Taewoo Kim wrote:
> @Yingyi: thanks.
>
> @Mike: Yeah. My problem is how to associate the field type information.
> Ideally, the leaf level has the field-to-type hash map, and its parent
> has that hash map in its record type. And that parent's parent needs to have
> the necessary information to reach this record type. If we don't need any
> pre-defined type at each level to create a multi-level enforced index, then
> things will become more complex for me. :-) Anyway, we can discuss further
> to finalize the field type propagation implementation.
>
> Best,
> Taewoo
>
> On Thu, Jul 13, 2017 at 11:02 PM, Mike Carey <dtabass@gmail.com> wrote:
>
>> Taewoo,
>>
>> To clarify further what should work:
>>   - We should support nested indexes that go down multiple levels.
>>   - We should (ideally) support their use in index-NL joins.
>>
>> Reflecting on our earlier conversation(s), I think I can see why you're
>> asking this. :-) The augmented type information that'll be needed to do
>> this completely/properly will actually have to associate types with field
>> paths (not just with fields by name) - which is a slightly more complicated
>> association.
>>
>> Cheers,
>> Mike
>>
>>
>> On 7/13/17 10:54 PM, Yingyi Bu wrote:
>>
>>> Hi Taewoo,
>>>
>>> The first query shouldn't fail because indexnl is just a hint.
>>> The second query should succeed because it's a valid indexing statement.
>>> Deep nesting in open records, as in JSON, is not uncommon.
>>>
>>> Best,
>>> Yingyi
>>>
>>>
>>> On Thu, Jul 13, 2017 at 10:51 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>>>
>>>> @Mike: In order to properly deal with the enforced index on a nested-type
>>>> field, I need to make sure whether my understanding (each nested type,
>>>> except the leaf level, has a record type for the next level) is correct
>>>> or not. Which one is a bug? Should the first one (without an index) fail?
>>>> Or should the second one (with an index) succeed?
>>>>
>>>> Best,
>>>> Taewoo
>>>>
>>>> On Thu, Jul 13, 2017 at 9:58 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
>>>>
>>>>> Indeed, it's a bug!
>>>>>
>>>>> Best,
>>>>> Yingyi
>>>>>
>>>>> On Thu, Jul 13, 2017 at 9:52 PM, Mike Carey <dtabass@gmail.com> wrote:
>>>>>
>>>>>> Sounds like a bug to me.
>>>>>>
>>>>>>
>>>>>> On 7/13/17 7:59 PM, Taewoo Kim wrote:
>>>>>>
>>>>>>> Currently, I am working on field type propagation without
>>>>>>> initializing the OptimizableSubTree in the current index access
>>>>>>> method. I am encountering an issue with an open-type enforced index.
>>>>>>> So, I just want to make sure that my understanding is correct. It
>>>>>>> looks like we can't have an enforced index on a completely schemaless
>>>>>>> nested field. For example, the following doesn't generate any issue.
>>>>>>>
>>>>>>> //
>>>>>>> create type DBLPType as open {id: int32}
>>>>>>> create type CSXType as closed {id: int32}
>>>>>>>
>>>>>>> create dataset DBLP(DBLPType) primary key id;
>>>>>>> create dataset CSX(CSXType) primary key id;
>>>>>>>
>>>>>>> for $a in dataset('DBLP')
>>>>>>> for $b in dataset('CSX')
>>>>>>> where $a.nested.one.title /*+ indexnl */ = $b.nested.one.title
>>>>>>> return {"arec": $a, "brec": $b}
>>>>>>> //
>>>>>>>
>>>>>>> However, the following generates an exception. So, can we assume that
>>>>>>> to create an enforced index, there should be a defined record type at
>>>>>>> every level except the leaf level? In this example, there should be a
>>>>>>> "nested" type and a "one" type.
>>>>>>>
>>>>>>> //
>>>>>>> create type DBLPType as open {id: int32}
>>>>>>> create type CSXType as closed {id: int32}
>>>>>>>
>>>>>>> create dataset DBLP(DBLPType) primary key id;
>>>>>>> create dataset CSX(CSXType) primary key id;
>>>>>>>
>>>>>>> create index title_index_DBLP on DBLP(nested.one.title: string?) enforced;
>>>>>>> create index title_index_CSX on CSX(nested.one.title: string?) enforced;
>>>>>>>
>>>>>>> for $a in dataset('DBLP')
>>>>>>> for $b in dataset('CSX')
>>>>>>> where $a.nested.one.title /*+ indexnl */ = $b.nested.one.title
>>>>>>> return {"arec": $a, "brec": $b}
>>>>>>> //
>>>>>>>
>>>>>>> Best,
>>>>>>> Taewoo
>>>>>>>
>>>>>>>
>>>>>>>

