asterixdb-dev mailing list archives

From Steven Jacobs <sjaco...@ucr.edu>
Subject Re: [jira] [Updated] (ASTERIXDB-1085) Sporadic failures in Feed related tests
Date Wed, 30 Sep 2015 21:39:11 GMT
I think the problem with building a single job (as suggested) is that one
intake job serves many connection jobs: there is a single intake job per
feed, and a separate connection job for each dataset the feed is connected
to.
Steven

On Wed, Sep 30, 2015 at 12:05 PM, abdullah alamoudi <bamousaa@gmail.com>
wrote:

> So I might have an idea about what could cause this.
> Here is some information about how feeds work. Please correct me if
> I am wrong, as I am just starting to dive deep into this.
>
> -- Creating and Dropping feeds are just Metadata operations.
> -- When you connect a primary feed to a dataset, this is what happens:
> 1. A feed event subscriber is created for the feed and registered with the
> feed lifecycle listener (a singleton running on the master).
> 2. A feed intake job is constructed that consists of just the feed intake
> operator and a sink operator. When this job starts, it sits in memory doing
> nothing because it has no subscribers yet.
> 3. Once the job [2] is submitted, the listener in [1] gets notified and
> constructs an adm command that creates a Hyracks job with a feed
> collect operator, which gets records from the running intake job [2] and
> feeds them into the dataset.
> 4. There is no synchronization between [2] and [3], so there is a chance
> that [3] starts before [2] is ready, fails to find the intake runtime, and
> throws an exception. I know the chance is slim, but it is there
> (it has happened to me).
> 5. At that point, the intake job will never return, since it is just
> sitting in memory.
>
> I am not sure about this but I am guessing that the larger the cluster, the
> higher the chance that one runs into this.
>
> The question I have is: since, at the connect statement, we already know
> everything about the dataset that the feed will populate, why don't
> we construct a single job that has two roots (the sink and the commit)?
> Another option would be to make sure that the intake is ready on all nodes
> before the subscription is submitted.
>
> Does any of this make sense?
>
>
> Amoudi, Abdullah.
>
> On Mon, Sep 14, 2015 at 8:23 PM, Till Westmann (JIRA) <jira@apache.org>
> wrote:
>
> >
> >      [
> > https://issues.apache.org/jira/browse/ASTERIXDB-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> > ]
> >
> > Till Westmann updated ASTERIXDB-1085:
> > -------------------------------------
> >     Assignee: Abdullah Alamoudi
> >
> > > Sporadic failures in Feed related tests
> > > ---------------------------------------
> > >
> > >                 Key: ASTERIXDB-1085
> > >                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1085
> > >             Project: Apache AsterixDB
> > >          Issue Type: Bug
> > >          Components: AsterixDB, Feeds
> > >            Reporter: Abdullah Alamoudi
> > >            Assignee: Abdullah Alamoudi
> > >
> > > Sporadically, test cases that use feeds (not necessarily in the feed
> > > test group) fail. No exception is thrown, but records that are
> > > supposed to be in the dataset are missing, and subsequent queries
> > > return empty results.
> >
> >
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.3.4#6332)
> >
>
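
The second option in the quoted message (wait until the intake is ready on all nodes before submitting the subscription) could be sketched roughly as below. This is only an illustration under assumptions, not AsterixDB code: the class IntakeReadiness and its methods expectIntake, intakeReadyOnNode, and awaitIntake are hypothetical names, and it assumes the master knows how many nodes run the intake operator.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of AsterixDB. */
public class IntakeReadiness {
    // One latch per feed, counting the nodes whose intake runtime has not registered yet.
    private final Map<String, CountDownLatch> latches = new ConcurrentHashMap<>();

    /** Called once per feed, before the intake job [2] is submitted. */
    public void expectIntake(String feedName, int nodeCount) {
        latches.putIfAbsent(feedName, new CountDownLatch(nodeCount));
    }

    /** Called by each node once its intake runtime is registered. */
    public void intakeReadyOnNode(String feedName) {
        CountDownLatch latch = latches.get(feedName);
        if (latch != null) {
            latch.countDown();
        }
    }

    /**
     * Called before a connection (collect) job [3] is submitted.
     * Returns true only if every node reported readiness within the timeout;
     * returns false for an unknown feed or on timeout.
     */
    public boolean awaitIntake(String feedName, long timeout, TimeUnit unit)
            throws InterruptedException {
        CountDownLatch latch = latches.get(feedName);
        return latch != null && latch.await(timeout, unit);
    }
}
```

Because a CountDownLatch stays open once it reaches zero, any number of later connection jobs can call awaitIntake on the same feed and return immediately, which fits the one-intake-job, many-connection-jobs layout described at the top of this message.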
