asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: [jira] [Updated] (ASTERIXDB-1085) Sporadic failures in Feed related tests
Date Wed, 30 Sep 2015 22:32:34 GMT
Approach number two seems right, i.e., synchronize the steps so that the
input is ready first!
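
A minimal sketch of what "approach number two" could look like, assuming each node running the intake operator can report when its intake runtime is registered. The class and method names (IntakeReadinessGate, markNodeReady, awaitAllReady) are hypothetical and are not part of AsterixDB or Hyracks; the sketch only illustrates the ordering constraint with a standard java.util.concurrent.CountDownLatch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; none of these names exist in AsterixDB or Hyracks. */
class IntakeReadinessGate {
    // One count per node that must register its intake runtime.
    private final CountDownLatch pendingNodes;

    IntakeReadinessGate(int intakeNodeCount) {
        this.pendingNodes = new CountDownLatch(intakeNodeCount);
    }

    /** Called once by each node after its intake runtime is registered. */
    void markNodeReady() {
        pendingNodes.countDown();
    }

    /**
     * Called before the connection (collect) job is submitted.
     * Returns true if every node reported ready within the timeout.
     */
    boolean awaitAllReady(long timeout, TimeUnit unit) throws InterruptedException {
        return pendingNodes.await(timeout, unit);
    }

    /** Simulates two intake nodes starting up while a subscriber waits. */
    static boolean demo() throws InterruptedException {
        IntakeReadinessGate gate = new IntakeReadinessGate(2);

        // Before any node reports in, the subscriber must not proceed.
        boolean openedTooEarly = gate.awaitAllReady(10, TimeUnit.MILLISECONDS);

        Thread node1 = new Thread(gate::markNodeReady);
        Thread node2 = new Thread(gate::markNodeReady);
        node1.start();
        node2.start();

        // The subscriber proceeds only after both nodes have reported ready.
        boolean ready = gate.awaitAllReady(5, TimeUnit.SECONDS);
        node1.join();
        node2.join();
        return !openedTooEarly && ready;
    }
}
```

With a gate like this, a timeout on awaitAllReady would surface the failure explicitly, rather than leaving the collect job to fail on a missing intake runtime while the intake job sits in memory.
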
On Sep 30, 2015 2:39 PM, "Steven Jacobs" <sjaco002@ucr.edu> wrote:

> I think the problem with building a single job (as mentioned) is that one
> intake job serves many connection jobs: there is a single intake job per
> feed, and a separate connection job for each connection to a dataset.
> Steven
>
> On Wed, Sep 30, 2015 at 12:05 PM, abdullah alamoudi <bamousaa@gmail.com>
> wrote:
>
> > So I might have an idea about what could cause this.
> > Here is some information about how feeds work. Please correct me if
> > I am wrong, as I am just starting to dive deep into this.
> >
> > -- Creating and dropping feeds are just metadata operations.
> > -- When you connect a primary feed to a dataset, this is what happens:
> > 1. A feed event subscriber is created for the feed and registered with the
> > feed lifecycle listener (a singleton running on the master).
> > 2. A feed intake job is constructed that consists of just the feed intake
> > operator and a sink operator. When this job starts, it sits in memory
> > doing nothing because it has no subscribers yet.
> > 3. Once the job [2] is submitted, the listener in [1] gets notified and
> > constructs an adm command that creates a Hyracks job with a feed
> > collect operator, which gets records from the running intake job [2] and
> > feeds them into the dataset.
> > 4. There is no synchronization between [2] and [3], so there is a chance
> > that [3] starts before [2] is ready, doesn't find the intake runtime,
> > and throws an exception. I know the chance is slim, but it is there
> > (it has happened to me).
> > 5. At that point, the intake job will never return, since it is just
> > sitting in memory.
> >
> > I am not sure about this, but I am guessing that the larger the cluster,
> > the higher the chance that one runs into this.
> >
> > The question I have is: since at the connect statement we already know
> > everything about the dataset that the feed will feed into, why don't
> > we construct a single job that has two roots (the sink and the commit)?
> > Another option would be to make sure that the intake is ready on all
> > nodes before the subscription is submitted.
> >
> > Does any of this make sense?
> >
> >
> > Amoudi, Abdullah.
> >
> > On Mon, Sep 14, 2015 at 8:23 PM, Till Westmann (JIRA) <jira@apache.org>
> > wrote:
> >
> > >
> > >      [ https://issues.apache.org/jira/browse/ASTERIXDB-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> > >
> > > Till Westmann updated ASTERIXDB-1085:
> > > -------------------------------------
> > >     Assignee: Abdullah Alamoudi
> > >
> > > > Sporadic failures in Feed related tests
> > > > ---------------------------------------
> > > >
> > > >                 Key: ASTERIXDB-1085
> > > >                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1085
> > > >             Project: Apache AsterixDB
> > > >          Issue Type: Bug
> > > >          Components: AsterixDB, Feeds
> > > >            Reporter: Abdullah Alamoudi
> > > >            Assignee: Abdullah Alamoudi
> > > >
> > > > Sporadically, test cases that use feeds (not necessarily in the feed
> > > > test group) fail. No exceptions are thrown, but records that are
> > > > supposed to be in the dataset are missing, and subsequent queries
> > > > return empty results.
