asterixdb-dev mailing list archives

From Chen Li <che...@gmail.com>
Subject Re: Question about open indexes
Date Sat, 26 Sep 2015 00:33:56 GMT
I vote for including this fix in the next Asterix/Hyracks release, not this
one.

Chen

On Fri, Sep 25, 2015 at 4:23 PM, Ildar Absalyamov <
ildar.absalyamov@gmail.com> wrote:

> It did not really occur to me during today's meeting, but Preston
> pointed out that the secondary index delete fix that I proposed spans
> both the Hyracks & Asterix codebases. Thus we will either have to release
> Hyracks once again, or bite the bullet, sign the RC without fixing
> this issue, and create bug-fix releases for both Hyracks & Asterix right
> after.
>
> > On Sep 22, 2015, at 22:27, Mike Carey <dtabass@gmail.com> wrote:
> >
> > Ah - that makes sense now.  Thx.  (And welcome back. :-))
> >
> > On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
> >> Sorry for the confusion, my initial answer was not correct enough; I
> >> probably should have waited some time after I drove 1500 miles from
> >> Seattle :)
> >> The casting in the insert pipeline, which Abdullah mentioned, is needed
> >> only for secondary index insert. The reasoning behind this casting is to
> >> ensure that the record is equivalent, thus it is safe to create an open
> >> index. It is true that we can get <Pk, Sk> pairs out of the original
> >> record using get-field-by-name\index, but the cast operator is introduced
> >> merely to kill the pipeline if the dataset input is not correct.
> >> Thus the records in primary are never touched or modified, no matter
> >> what indexes were created.
> >> I am not sure, however, what the second cast in Abdullah’s plan is, and
> >> where it comes from.
> >>
> >> @Taewoo, so the scan-delete-btree-secondary-index-open test does not
> >> actually delete data from the secondary index? I have checked the plan
> >> and it has the delete operator. Maybe it is initialized with the wrong
> >> parameters; I’ll have a closer look.
> >>
> >>> On Sep 22, 2015, at 18:33, Mike Carey <dtabass@gmail.com> wrote:
> >>>
> >>> Sounds kinda bad!  Also, I wonder what happens when the compiler
> >>> encounters records in the dataset - whose type in the catalog doesn't
> >>> claim to have a given (but now indexed) open field - e.g., during a data
> >>> scan or an access via some other path?  Can Bad Things Happen due to the
> >>> compiler not properly anticipating the casted form of the records?
> >>> (Maybe I am misunderstanding something, but we should probably take a
> >>> careful look at the test cases - and make sure we do things like add a
> >>> bunch of records, then add such an index, then add some more records,
> >>> then stress-test type-related things that come at the dataset (i) thru
> >>> the index, (ii) thru a primary dataset scan, and (iii) thru some other
> >>> index.)
> >>>
> >>> On 9/22/15 4:06 PM, Taewoo Kim wrote:
> >>>> I think this issue:
> >>>> https://issues.apache.org/jira/browse/ASTERIXDB-1109 is related.
> >>>> Currently, index entries (SK, PK) are not deleted on an open-type
> >>>> secondary index during a deletion. This issue was not surfaced due to
> >>>> the fact that every search after a secondary index search had to go
> >>>> through the primary index lookup.
> >>>>
> >>>> Best,
> >>>> Taewoo
> >>>>
> >>>> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
> >>>> ildar.absalyamov@gmail.com> wrote:
> >>>>
> >>>>> Abdullah,
> >>>>>
> >>>>> If I remember correctly, whenever a secondary open index is created,
> >>>>> all existing records are cast to the proper type to ensure that the
> >>>>> index creation is valid.
> >>>>> As for the overall correctness of the casting operation, semantically
> >>>>> creating an open index is the same thing as altering the dataset type.
> >>>>> The current implementation allows only one open index of a particular
> >>>>> type to be created on a single field. If we had “alter datatype”
> >>>>> functionality, open indexing would not be required at all.
> >>>>>
> >>>>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <amoudi@apache.org> wrote:
> >>>>>>
> >>>>>> More thoughts:
> >>>>>> I assume the intention of the cast was just to make sure that, if the
> >>>>>> open field exists, it is of the specified type. Moreover, the
> >>>>>> un-casted record should be inserted into the index.
> >>>>>> If my assumptions are not correct, please let me know ASAP.
> >>>>>>
> >>>>>> I have two thoughts on this:
> >>>>>> 1. Actually, insert plans show that the record being inserted into
> >>>>>> the primary index is the casted record, creating the issue described
> >>>>>> above.
> >>>>>>
> >>>>>> 2. I don't believe this is the right way to ensure that the open
> >>>>>> field, if it exists, is of the right type. Why not extract the field
> >>>>>> using the field-access-by-name function and then verify the type
> >>>>>> using the field tag?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <amoudi@apache.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Dev, @Ildar,
> >>>>>>>
> >>>>>>> In the insert pipeline for datasets with open indexes, we introduce
> >>>>>>> a cast function before the insert, and so one would expect the
> >>>>>>> records to look like the casted record type, which I assume has
> >>>>>>> {{the closed fields + a nullable field}}.
> >>>>>>>
> >>>>>>> The question is, what happens to the previously existing records,
> >>>>>>> since the index now has both records of the original type and
> >>>>>>> records of the casted type?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Abdullah.
> >>>>>>>
> >>>>> Best regards,
> >>>>> Ildar
> >>>>>
> >>>>>
> >> Best regards,
> >> Ildar
> >>
> >>
> >
>
> Best regards,
> Ildar
>
>
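
For readers following the thread, here is a toy sketch of the behaviour Taewoo describes for ASTERIXDB-1109: a delete that leaves the (SK, PK) entry behind in the secondary index, and why the primary index lookup after each secondary search hid the problem. This is plain Python with dicts standing in for the indexes. Every name in it (primary, secondary, buggy_delete, fixed_delete, search) is made up for illustration and is not taken from the Asterix or Hyracks code, and the "fixed" variant is only one plausible shape of a fix, not the patch Ildar proposed.

```python
# Toy model only: dicts stand in for the primary and secondary indexes.
# All names here are hypothetical, not AsterixDB/Hyracks APIs.

primary = {}     # PK -> record
secondary = {}   # SK -> set of PKs, i.e. the (SK, PK) entries


def insert(pk, record, indexed_field):
    """Insert into the primary index and, if the open field exists, the secondary."""
    primary[pk] = record
    sk = record.get(indexed_field)  # open field, may be absent
    if sk is not None:
        secondary.setdefault(sk, set()).add(pk)


def buggy_delete(pk):
    """Mimics the reported behaviour: the (SK, PK) entry is left behind."""
    primary.pop(pk, None)


def fixed_delete(pk, indexed_field):
    """Removes the record and its matching (SK, PK) entry."""
    record = primary.pop(pk, None)
    if record is None:
        return
    sk = record.get(indexed_field)
    if sk is not None:
        pks = secondary.get(sk, set())
        pks.discard(pk)
        if not pks:
            secondary.pop(sk, None)


def search(sk):
    """Secondary index search followed by a primary index lookup.

    The lookup filters out PKs that no longer exist in the primary index,
    which is why stale secondary entries do not show up in query results.
    """
    return [primary[pk] for pk in sorted(secondary.get(sk, ())) if pk in primary]
```

In this model, after buggy_delete the stale PK is still present in the secondary structure, yet search returns only live records, so a query-level test would pass while the index quietly keeps the stale entry.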
