asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Cannot load an index that is not empty [TreeIndexException]
Date Sat, 20 Feb 2016 23:04:31 GMT
Sounds like the load job parallelism needs a redo - it probably shouldn't
be more than the number of target partitions IMO...?
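
To make the parallelism point concrete, here is a minimal Python sketch. It is not AsterixDB or Hyracks code, and the 8 target partitions and 8 pool threads are hypothetical numbers that do not come from this thread. The helper names are also made up for illustration: `assign_files_to_partitions` caps the number of reader tasks at the partition count, and `peak_concurrency` shows that when tasks outnumber pool threads, the extra tasks wait until a thread frees up.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time


def assign_files_to_partitions(files, num_partitions):
    """Spread files round-robin over at most num_partitions reader tasks."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    # Never create more tasks than partitions (or than files).
    num_tasks = min(len(files), num_partitions)
    groups = [[] for _ in range(num_tasks)]
    for i, name in enumerate(files):
        groups[i % num_tasks].append(name)
    return groups


def peak_concurrency(num_tasks, pool_size, work_seconds=0.01):
    """Run dummy tasks on a fixed-size pool; return the most that ran at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task():
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(work_seconds)  # stand-in for reading one file
        with lock:
            running -= 1

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(task) for _ in range(num_tasks)]
        for future in futures:
            future.result()  # re-raise any task failure
    return peak


if __name__ == "__main__":
    files = [f"file_{i}" for i in range(76)]  # hypothetical file names
    groups = assign_files_to_partitions(files, num_partitions=8)
    print(len(groups), [len(g) for g in groups])
    print(peak_concurrency(num_tasks=76, pool_size=8))
```

With 76 files and 8 partitions, each reader task ends up with 9 or 10 files, so no task has to queue behind another just to get a thread.
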
On Feb 20, 2016 12:41 PM, "abdullah alamoudi" <bamousaa@gmail.com> wrote:

> I have an idea that might explain this strange behavior. I believe it
> could be due to the number of task partitions being very high, assuming
> each of the 76 files is being read in a separate task.
> This could lead to corner cases that we didn't consider before: if the
> number of threads in the task thread pool is less than 76, some tasks
> will not be able to start until others have completed execution.
>
> Just a thought,
> Abdullah.
>
> On Fri, Feb 19, 2016 at 9:43 PM, abdullah alamoudi <bamousaa@gmail.com>
> wrote:
>
> > Yiran,
> > Here is one problem causing a failure:
> > edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> > edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> > edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
> > Input stream given to BTree bulk load has duplicates.
> >
> > which tells us that the input stream given to the BTree bulk load has
> > duplicates. The question is why this was not returned as the error
> > message. We need to look into that.
> >
> > I will continue looking at the log file to see if there were other issues.
> >
> > Can you share with us the load statement you're using? I would like to see
> > how you're loading all the files. We might be able to suggest a way to make
> > it work better.
> >
> > Cheers,
> > Abdullah.
> >
> > On Fri, Feb 19, 2016 at 9:31 PM, Yiran Wang <wyr4137@gmail.com> wrote:
> >
> >> Abdullah,
> >>
> >> Here is the log attached. Thank you all very much for looking into this.
> >>
> >> Ian - I have two query questions besides this loading issue. I was
> >> wondering if I can meet briefly with you (or over email) regarding that.
> >>
> >> Thanks!
> >> Yiran
> >>
> >> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <dtabass@gmail.com> wrote:
> >>
> >>> Maybe Ian can visit the cluster with Yiran later today?
> >>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <bamousaa@gmail.com> wrote:
> >>>
> >>>> Yiran,
> >>>> Can you share the logs? They would help us identify the actual cause
> >>>> of this failure much faster.
> >>>>
> >>>> I am pretty sure you know this, but in case you don't, you can get the
> >>>> logs using
> >>>> >managix log -n <instance-name>
> >>>>
> >>>> Also, it would be nice if someone from the team has access to the
> >>>> cluster so we can work with it directly.
> >>>> Cheers,
> >>>> Abdullah.
> >>>>
> >>>>
> >>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <wyr4137@gmail.com> wrote:
> >>>>
> >>>>> Steven,
> >>>>>
> >>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is what
> >>>>> happened:
> >>>>>
> >>>>> I test-loaded the first 32 files, no problem. I deleted the dataset,
> >>>>> created a new one, and tried to load the entire 76 files into the newly
> >>>>> created (hence empty) dataset.
> >>>>>
> >>>>> It took about 2 minutes after executing the query for the error message
> >>>>> to show up. There are currently 31710406 rows of data in the dataset,
> >>>>> despite the error message (so it looks like it did load).
> >>>>>
> >>>>> So my questions are: 1) why did I still get that error message when I
> >>>>> was loading into an empty dataset; and 2) I'm not sure whether all the
> >>>>> data from the 76 files is fully loaded. Are there other ways to check,
> >>>>> besides trying to load it again and hoping this time I don't get the
> >>>>> error?
> >>>>>
> >>>>> Thanks!
> >>>>> Yiran
> >>>>>
> >>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <sjaco002@ucr.edu>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> Welcome! We are an Apache incubator project now so I added the
> >>>>>> correct mailing list. Our "load" statement only works on an empty
> >>>>>> dataset. Subsequent data needs to be added with an insert or a feed.
> >>>>>> You should be able to load all 76 files at once though (starting
> >>>>>> from empty).
> >>>>>> Steven
> >>>>>>
> >>>>>>
> >>>>>> On Thursday, February 18, 2016, Yiran Wang <wyr4137@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Asterix team!
> >>>>>>>
> >>>>>>> I've come across this error when I was trying to load 76 files into
> >>>>>>> a dataset. When I test-loaded the first 32 files, there wasn't such
> >>>>>>> an error. All 76 files are of the same data format.
> >>>>>>>
> >>>>>>> Can you help interpret what this error message means?
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>> Yiran
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best,
> >>>>>>> Yiran
> >>>>>>>
> >>>>>>> --
> >>>>>>> You received this message because you are subscribed to the Google
> >>>>>>> Groups "asterixdb-dev" group.
> >>>>>>> To unsubscribe from this group and stop receiving emails from it,
> >>>>>>> send an email to asterixdb-dev+unsubscribe@googlegroups.com.
> >>>>>>> For more options, visit https://groups.google.com/d/optout.
> >>>>>>>
> >>>>>> --
> >>>>>> You received this message because you are subscribed to the Google
> >>>>>> Groups "asterixdb-users" group.
> >>>>>> To unsubscribe from this group and stop receiving emails from it,
> >>>>>> send an email to asterixdb-users+unsubscribe@googlegroups.com.
> >>>>>> For more options, visit https://groups.google.com/d/optout.
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best,
> >>>>> Yiran
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> Best,
> >> Yiran
> >>
> >>
> >
> >
>
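
The duplicate-key failure quoted above can be illustrated with a short Python sketch. This is not the Hyracks implementation; it assumes the bulk loader consumes keys in sorted order, so that a duplicate shows up as two equal adjacent keys. The names `DuplicateKeyError`, `check_bulk_load_input`, and `find_duplicate_keys` are hypothetical and exist only for this example.

```python
from collections import Counter


class DuplicateKeyError(Exception):
    """Raised when two adjacent keys in the sorted input are equal."""


def check_bulk_load_input(sorted_keys):
    """Yield keys one by one, failing on the first duplicate or unsorted key."""
    previous = None
    seen_any = False
    for key in sorted_keys:
        if seen_any:
            if key == previous:
                raise DuplicateKeyError(f"duplicate key in input stream: {key!r}")
            if key < previous:
                raise ValueError(f"input is not sorted: {key!r} after {previous!r}")
        yield key
        previous = key
        seen_any = True


def find_duplicate_keys(keys):
    """Return the sorted list of keys that occur more than once (any input order)."""
    counts = Counter(keys)
    return sorted(k for k, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(find_duplicate_keys([3, 1, 2, 3, 1]))
    try:
        list(check_bulk_load_input([1, 2, 2, 3]))
    except DuplicateKeyError as err:
        print(err)
```

Running something like `find_duplicate_keys` over the primary keys of all 76 files before loading would show whether the files overlap. How to extract those keys depends on the data format, which the thread does not state.
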
