asterixdb-dev mailing list archives

From "Till Westmann" <ti...@apache.org>
Subject Re: Question about open indexes
Date Mon, 28 Sep 2015 18:51:23 GMT
I’ve added the “0.8.7-blocker” to 
https://issues.apache.org/jira/browse/ASTERIXDB-1109 (which I believe 
covers this issue).
Is this what we agree on right now?

Also, do we already have a review for this?

Thanks,
Till

On 26 Sep 2015, at 1:37, abdullah alamoudi wrote:

> I agree with Chen, especially since the system is not yet production ready.
> It seems that going through with the release is more important.
>
> Cheers,
>
>
> Amoudi, Abdullah.
>
> On Sat, Sep 26, 2015 at 3:33 AM, Chen Li <chenli@gmail.com> wrote:
>
>> I vote for including this fix in the next Asterix/Hyracks release, not
>> this one.
>>
>> Chen
>>
>> On Fri, Sep 25, 2015 at 4:23 PM, Ildar Absalyamov <
>> ildar.absalyamov@gmail.com> wrote:
>>
>>> It did not really occur to me during the meeting today, but Preston
>>> pointed out that the secondary index delete fix that I proposed spans
>>> both the Hyracks and Asterix codebases. Thus we will either have to
>>> release Hyracks once again, or bite the bullet, sign the RC without
>>> fixing this issue, and create bug-fix releases for both Hyracks and
>>> Asterix right after.
>>>
>>>> On Sep 22, 2015, at 22:27, Mike Carey <dtabass@gmail.com> wrote:
>>>>
>>>> Ah - that makes sense now.  Thx.  (And welcome back. :-))
>>>>
>>>> On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
>>>>> Sorry for the confusion, my initial answer was not correct enough; I
>>>>> probably should have waited some time after I drove 1500 miles from
>>>>> Seattle :)
>>>>> The casting in the insert pipeline, which Abdullah mentioned, is
>>>>> needed only for the secondary index insert. The reasoning behind this
>>>>> casting is to ensure that the record is equivalent, thus it is safe to
>>>>> create an open index. It is true that we can get <Pk, Sk> pairs out of
>>>>> the original record using get-field-by-name\index, but the cast
>>>>> operator is introduced merely to kill the pipeline if the dataset
>>>>> input is not correct.
>>>>> Thus the records in the primary index are never touched or modified,
>>>>> no matter what indexes were created.
>>>>> I am not sure, however, what the second cast in Abdullah’s plan is,
>>>>> and where it comes from.
>>>>>
>>>>> @Taewoo, so the scan-delete-btree-secondary-index-open test does not
>>>>> actually delete data from the secondary index? I have checked the plan
>>>>> and it has the delete operator. Maybe it is initialized with the wrong
>>>>> parameters, I’ll have a closer look.
>>>>>
>>>>>> On Sep 22, 2015, at 18:33, Mike Carey <dtabass@gmail.com> wrote:
>>>>>>
>>>>>> Sounds kinda bad!  Also, I wonder what happens when the compiler
>>>>>> encounters records in the dataset - whose type in the catalog doesn't
>>>>>> claim to have a given (but now indexed) open field - e.g., during a
>>>>>> data scan or an access via some other path?  Can Bad Things Happen
>>>>>> due to the compiler not properly anticipating the casted form of the
>>>>>> records?  (Maybe I am misunderstanding something, but we should
>>>>>> probably take a careful look at the test cases - and make sure we do
>>>>>> things like add a bunch of records, then add such an index, then add
>>>>>> some more records, then stress-test type-related things that come at
>>>>>> the dataset (i) thru the index, (ii) thru a primary dataset scan, and
>>>>>> (iii) thru some other index.)
>>>>>>
>>>>>> On 9/22/15 4:06 PM, Taewoo Kim wrote:
>>>>>>> I think this issue:
>>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1109 is related.
>>>>>>> Currently, index entries (SK, PK) are not deleted on an open-type
>>>>>>> secondary index during a deletion. This issue was not surfaced due
>>>>>>> to the fact that every search after a secondary index search had to
>>>>>>> go through the primary index lookup.
>>>>>>>
>>>>>>> Best,
>>>>>>> Taewoo
>>>>>>>
>>>>>>> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
>>>>>>> ildar.absalyamov@gmail.com> wrote:
>>>>>>>
>>>>>>>> Abdullah,
>>>>>>>>
>>>>>>>> If I remember correctly, whenever a secondary open index is
>>>>>>>> created, all existing records are cast to the proper type to
>>>>>>>> ensure that the index creation is valid.
>>>>>>>> As for the overall correctness of the casting operation:
>>>>>>>> semantically, creating an open index is the same thing as altering
>>>>>>>> the dataset type. The current implementation allows only one open
>>>>>>>> index of a particular type to be created on a single field. If we
>>>>>>>> had “alter datatype” functionality, open indexing would not be
>>>>>>>> required at all.
>>>>>>>>
>>>>>>>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <amoudi@apache.org> wrote:
>>>>>>>>>
>>>>>>>>> More thoughts:
>>>>>>>>> I assume the intention of the cast was just to make sure that, if
>>>>>>>>> the open field exists, it is of the specified type. Moreover, the
>>>>>>>>> un-casted record should be inserted into the index.
>>>>>>>>> If my assumptions are not correct, please let me know ASAP.
>>>>>>>>>
>>>>>>>>> I have two thoughts on this:
>>>>>>>>> 1. Actually, the insert plans show that the record being inserted
>>>>>>>>> into the primary index is the casted record, which creates the
>>>>>>>>> issue described above.
>>>>>>>>>
>>>>>>>>> 2. I don't believe this is the right way to ensure that the open
>>>>>>>>> field, if it exists, is of the right type. Why not extract the
>>>>>>>>> field using the field access by name function and then verify the
>>>>>>>>> type using the field tag?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <amoudi@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Dev, @Ildar,
>>>>>>>>>>
>>>>>>>>>> In the insert pipeline for datasets with open indexes, we
>>>>>>>>>> introduce a cast function before the insert, and so one would
>>>>>>>>>> expect the records to look like the casted record type, which I
>>>>>>>>>> assume has {{the closed fields + a nullable field}}.
>>>>>>>>>>
>>>>>>>>>> The question is, what happens to the previously existing
>>>>>>>>>> records? The index now has both records of the original type
>>>>>>>>>> and records of the casted type.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Abdullah.
>>>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Ildar
>>>>>>>>
>>>>>>>>
>>>>> Best regards,
>>>>> Ildar
>>>>>
>>>>>
>>>>
>>>
>>> Best regards,
>>> Ildar
>>>
>>>
>>
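
To make Taewoo's point about ASTERIXDB-1109 concrete, here is a minimal
sketch in Python. It is a toy model, not AsterixDB or Hyracks code: the
Dataset class, its methods, and the maintain_secondary flag are hypothetical
names invented for illustration. It only shows why a delete that skips the
secondary index can go unnoticed when every secondary search is followed by
a primary index lookup.

```python
class Dataset:
    """Toy dataset: one primary index plus one secondary index on `field`.

    Hypothetical model for illustration only; not the AsterixDB API.
    """

    def __init__(self, field):
        self.field = field
        self.primary = {}        # PK -> record
        self.secondary = set()   # (SK, PK) pairs

    def insert(self, pk, record):
        self.primary[pk] = record
        if self.field in record:
            self.secondary.add((record[self.field], pk))

    def delete(self, pk, maintain_secondary=True):
        record = self.primary.pop(pk)
        # maintain_secondary=False mimics the reported bug: the (SK, PK)
        # entry is left behind in the secondary index.
        if maintain_secondary and self.field in record:
            self.secondary.discard((record[self.field], pk))

    def search(self, sk):
        # Secondary index search, then a primary index lookup per PK.
        # The primary lookup silently drops PKs that no longer exist.
        pks = sorted(pk for (key, pk) in self.secondary if key == sk)
        return [self.primary[pk] for pk in pks if pk in self.primary]


ds = Dataset("age")
ds.insert(1, {"id": 1, "age": 30})
ds.insert(2, {"id": 2, "age": 30})
ds.delete(1, maintain_secondary=False)

print(ds.search(30))            # only record 2 is returned
print((30, 1) in ds.secondary)  # the stale entry is still in the index
```

In this model the query result is correct even though the secondary index
still holds an entry for the deleted record, so a test that only checks
query results would not catch the problem.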

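Abdullah's alternative to the cast (extract the open field by name, then
verify its type) could look roughly like the sketch below. Again this is a
toy Python model under stated assumptions: check_open_field is a
hypothetical helper, records are plain dicts, and Python types stand in for
the type tag. It is not the actual operator used in the insert pipeline.

```python
def check_open_field(record, field, expected_type):
    """Return the secondary key for `record`, or None if the field is absent.

    Raises TypeError if the field exists with a different type, which plays
    the role of "killing the pipeline". The record itself is never modified.
    Hypothetical helper for illustration only.
    """
    if field not in record:
        return None  # open field absent: nothing to index
    value = record[field]
    # Exact type match, standing in for a type-tag comparison.
    if type(value) is not expected_type:
        raise TypeError(
            f"field {field!r} has type {type(value).__name__}, "
            f"expected {expected_type.__name__}"
        )
    return value


record = {"id": 1, "age": 30}
sk = check_open_field(record, "age", int)
print(sk, record)  # the record is passed through unchanged
```

The design point is that validation and key extraction happen on the side,
so the record written to the primary index is the original one rather than
a casted copy.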