asterixdb-dev mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: Question about open indexes
Date Mon, 28 Sep 2015 19:15:42 GMT
I think the general thought is that, unfortunately, the fix for this
can't go into 0.8.7 as it is right now, because it needs changes to
Hyracks first. We should probably do a bugfix release right after.

-Ian

On Mon, Sep 28, 2015 at 11:51 AM, Till Westmann <tillw@apache.org> wrote:
> I’ve added the “0.8.7-blocker” to
> https://issues.apache.org/jira/browse/ASTERIXDB-1109 (which I believe covers
> this issue).
> Is this what we agree on right now?
>
> Also, do we already have a review for this?
>
> Thanks,
> Till
>
>
> On 26 Sep 2015, at 1:37, abdullah alamoudi wrote:
>
>> I agree with Chen especially with the system not yet production ready.
>> It seems that going through with the release is more important.
>>
>> Cheers,
>>
>>
>> Amoudi, Abdullah.
>>
>> On Sat, Sep 26, 2015 at 3:33 AM, Chen Li <chenli@gmail.com> wrote:
>>
>>> I vote for including this fix in the next Asterix/Hyracks release, not
>>> this one.
>>>
>>> Chen
>>>
>>> On Fri, Sep 25, 2015 at 4:23 PM, Ildar Absalyamov <
>>> ildar.absalyamov@gmail.com> wrote:
>>>
>>>> It did not really occur to me during the meeting today, but Preston
>>>> pointed out that the secondary index delete fix that I proposed spans
>>>> both the Hyracks & Asterix codebases. Thus we will either have to release
>>>> Hyracks once again, or bite the bullet, sign the RC without fixing
>>>> this issue, and create bug-fix releases for both Hyracks & Asterix right
>>>> after.
>>>>
>>>>
>>>>> On Sep 22, 2015, at 22:27, Mike Carey <dtabass@gmail.com> wrote:
>>>>>
>>>>> Ah - that makes sense now.  Thx.  (And welcome back. :-))
>>>>>
>>>>> On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
>>>>>>
>>>>>> Sorry for the confusion, my initial answer was not correct enough; I
>>>>>> probably should have waited some time after I drove 1500 miles from
>>>>>> Seattle :)
>>>>>>
>>>>>> The casting in the insert pipeline, which Abdullah mentioned, is needed
>>>>>> only for secondary index insert. The reasoning behind this casting is to
>>>>>> ensure that the record is equivalent, thus it is safe to create an open
>>>>>> index. It is true that we can get <Pk, Sk> pairs out of the original record
>>>>>> using get-field-by-name\index, but the cast operator is introduced merely
>>>>>> to kill the pipeline if the dataset input is not correct.
>>>>>>
>>>>>> Thus the records in primary are never touched or modified, no matter
>>>>>> what indexes were created.
>>>>>>
>>>>>> I am not sure, however, what the second cast in Abdullah’s plan is, and
>>>>>> where it comes from.
>>>>>>
>>>>>>
>>>>>> @Taewoo, so the scan-delete-btree-secondary-index-open test does not
>>>>>> actually delete data from the secondary index? I have checked the plan and
>>>>>> it has the delete operator. Maybe it is initialized with the wrong
>>>>>> parameters; I’ll have a closer look.
>>>>>>
>>>>>>
>>>>>>> On Sep 22, 2015, at 18:33, Mike Carey <dtabass@gmail.com> wrote:
>>>>>>>
>>>>>>> Sounds kinda bad!  Also, I wonder what happens when the compiler
>>>>>>> encounters records in the dataset - whose type in the catalog doesn't
>>>>>>> claim to have a given (but now indexed) open field - e.g., during a data
>>>>>>> scan or an access via some other path?  Can Bad Things Happen due to the
>>>>>>> compiler not properly anticipating the casted form of the records?
>>>>>>> (Maybe I am misunderstanding something, but we should probably take a
>>>>>>> careful look at the test cases - and make sure we do things like add a
>>>>>>> bunch of records, then add such an index, then add some more records,
>>>>>>> then stress-test type-related things that come at the dataset (i) thru
>>>>>>> the index, (ii) thru a primary dataset scan, and (iii) thru some other
>>>>>>> index.)
>>>>>>>
>>>>>>>
>>>>>>> On 9/22/15 4:06 PM, Taewoo Kim wrote:
>>>>>>>>
>>>>>>>> I think this issue:
>>>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1109 is
>>>>>>>> related. Currently, index entries (SK, PK) are not deleted on an
>>>>>>>> open-type secondary index during a deletion. This issue did not surface
>>>>>>>> due to the fact that every search after a secondary index search had to
>>>>>>>> go through the primary index lookup.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Taewoo
>>>>>>>>
>>>>>>>> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
>>>>>>>> ildar.absalyamov@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Abdullah,
>>>>>>>>>
>>>>>>>>> If I remember correctly, whenever a secondary open index is created,
>>>>>>>>> all existing records are cast to the proper type to ensure that the
>>>>>>>>> index creation is valid.
>>>>>>>>> As for the overall correctness of the casting operation, semantically
>>>>>>>>> creating an open index is the same thing as altering the dataset type.
>>>>>>>>> The current implementation allows only one open index of a particular
>>>>>>>>> type to be created on a single field. If we had “alter datatype”
>>>>>>>>> functionality, open indexing would not be required at all.
>>>>>>>>>
>>>>>>>>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <amoudi@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> More thoughts:
>>>>>>>>>> I assume the intention of the cast was just to make sure that, if the
>>>>>>>>>> open field exists, it is of the specified type. Moreover, the un-casted
>>>>>>>>>> record should be inserted into the index.
>>>>>>>>>> If my assumptions are not correct, please let me know ASAP.
>>>>>>>>>>
>>>>>>>>>> I have two thoughts on this:
>>>>>>>>>> 1. Actually, insert plans show that the record being inserted into the
>>>>>>>>>> primary index is the casted record, creating the issue described
>>>>>>>>>> above.
>>>>>>>>>>
>>>>>>>>>> 2. I don't believe this is the right way to ensure that the open field,
>>>>>>>>>> if it exists, is of the right type. Why not extract the field using the
>>>>>>>>>> field access by name function and then verify the type using the field
>>>>>>>>>> tag?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <amoudi@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Dev, @Ildar,
>>>>>>>>>>>
>>>>>>>>>>> In the insert pipeline for datasets with open indexes, we introduce a
>>>>>>>>>>> cast function before the insert and so one would expect the records to
>>>>>>>>>>> look like the casted record type which I assume has {{the closed
>>>>>>>>>>> fields + a nullable field}}.
>>>>>>>>>>>
>>>>>>>>>>> The question is, what happens to the previously existing records,
>>>>>>>>>>> since now the index has both records of the original type and records
>>>>>>>>>>> of the casted type?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Abdullah.
>>>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Ildar
>>>>>>>>>
>>>>>>>>>
>>>>>> Best regards,
>>>>>> Ildar
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> Best regards,
>>>> Ildar
>>>>
>>>>
>>>
>
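
Editor's sketch (not part of the thread): the behaviour Taewoo describes can be illustrated with a toy model. Everything below is hypothetical (the `ToyDataset` class, the dict and set used as "indexes", the `buggy_delete` flag) and is not AsterixDB or Hyracks code. It only shows, under those assumptions, how a stale (SK, PK) entry left behind by a delete can stay hidden as long as every secondary-index search is followed by a primary-index lookup.

```python
class ToyDataset:
    """Toy model: a primary index (PK -> record) and one secondary index of (SK, PK) pairs."""

    def __init__(self, sk_field, buggy_delete=True):
        self.primary = {}        # PK -> record
        self.secondary = set()   # {(SK, PK), ...}
        self.sk_field = sk_field
        self.buggy_delete = buggy_delete  # hypothetical switch modelling the reported bug

    def insert(self, pk, record):
        self.primary[pk] = record
        if self.sk_field in record:  # the indexed open field is optional
            self.secondary.add((record[self.sk_field], pk))

    def delete(self, pk):
        record = self.primary.pop(pk)
        # In the buggy variant the (SK, PK) entry is left behind.
        if not self.buggy_delete and self.sk_field in record:
            self.secondary.discard((record[self.sk_field], pk))

    def secondary_entries(self, sk):
        """Raw secondary-index entries for a key, without touching the primary index."""
        return sorted(entry for entry in self.secondary if entry[0] == sk)

    def search_secondary(self, sk):
        """Secondary search followed by a primary lookup; stale PKs are silently dropped."""
        pks = [pk for (_, pk) in self.secondary_entries(sk)]
        return [self.primary[pk] for pk in pks if pk in self.primary]


ds = ToyDataset("age")
ds.insert(1, {"id": 1, "age": 30})
ds.insert(2, {"id": 2, "age": 30})
ds.delete(1)

print(ds.search_secondary(30))   # the primary lookup hides the stale entry
print(ds.secondary_entries(30))  # the raw index still holds the entry for PK 1
```

In this model, a test that only checks query results passes even with the buggy delete; the raw secondary index has to be inspected to see the leftover entry.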

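Editor's sketch (not part of the thread): Abdullah's second point suggests extracting the open field by name and checking its type tag, instead of casting the whole record. A minimal illustration of that idea follows, assuming records are plain Python dicts and that Python types stand in for type tags. The function name `extract_secondary_key` is hypothetical and does not correspond to any AsterixDB function.

```python
def extract_secondary_key(record, field_name, expected_type):
    """Return the value of an optional (open) field after checking its type.

    Returns None when the field is absent, and raises TypeError when it is
    present with a different type. The record itself is never modified.
    """
    if field_name not in record:
        return None  # the open field is optional, so absence is allowed
    value = record[field_name]
    if type(value) is not expected_type:
        raise TypeError(
            f"field {field_name!r} has type {type(value).__name__}, "
            f"expected {expected_type.__name__}"
        )
    return value


record = {"id": 7, "age": 41}
key = extract_secondary_key(record, "age", int)
print((key, record["id"]))  # the <Sk, Pk> pair for the secondary index

try:
    extract_secondary_key({"id": 8, "age": "forty"}, "age", int)
except TypeError as err:
    print("rejected:", err)
```

As in the thread, the check rejects bad input, while the record that goes into the primary index stays in its original form.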