asterixdb-dev mailing list archives

From mingda li <limingda1...@gmail.com>
Subject Re: Re: Let one Operator finished the job before another one begin in Hyracks
Date Tue, 11 Oct 2016 17:52:11 GMT
Oh, thanks for all the explanations :-)
I will talk with Wenhai about how they implemented this and try to
finish it in one job.

Bests,
Mingda

On Tue, Oct 11, 2016 at 9:52 AM, 李文海 <lwh@whu.edu.cn> wrote:

>
>
>
> > -----Original Message-----
> > From: "Yingyi Bu" <buyingyi@gmail.com>
> > Sent: Wednesday, October 12, 2016
> > To: dev@asterixdb.apache.org
> > Cc:
> > Subject: Re: Let one Operator finished the job before another one begin in Hyracks
> >
> > +1!
> >
> > Best,
> > Yingyi
> >
> > On Tue, Oct 11, 2016 at 9:32 AM, Mike Carey <dtabass@gmail.com> wrote:
> >
> > > BUT AGAIN:  I think the preferred solution in this case is to do it in one
> > > job.  Mingda, I would suggest sync'ing up with Wenhai for a Skype meeting
> > > on how he/Preston have done essentially the very same thing in their use
> > > cases for parallel sorts and interval joins.  Hyracks has everything needed
> > > for this, as it turns out, without a multi-job need.
> > >
> > >
> > >
> > > On 10/11/16 9:26 AM, Yingyi Bu wrote:
> > >
> > >> You can search the usage of waitForCompletion in the code base, e.g.:
> > >>
> > >> APIFramework.java:
> > >>
> > >> public void executeJobArray(IHyracksClientConnection hcc,
> > >> JobSpecification[] specs, PrintWriter out)
> > >>          throws Exception {
> > >>      for (JobSpecification spec : specs) {
> > >>          spec.setMaxReattempts(0);
> > >>          JobId jobId = hcc.startJob(spec);
> > >>          long startTime = System.currentTimeMillis();
> > >>          hcc.waitForCompletion(jobId);
> > >>          long endTime = System.currentTimeMillis();
> > >>          double duration = (endTime - startTime) / 1000.00;
> > >>          out.println("<pre>Duration: " + duration + " sec</pre>");
> > >>      }
> > >>
> > >> }
> > >>
> > >>
> > >> You start a job and get the job ID, and then you can wait on the job ID.
> > >>
> > >>
> > >> Best,
> > >>
> > >> Yingyi
> > >>
> > >>
> > >> On Tue, Oct 11, 2016 at 1:45 AM, 李文海 <lwh@whu.edu.cn> wrote:
> > >>
> > >>> Hi, Mingda.
> > >>>      What you need is quite similar to what Preston and I have done.
> > >>> Actually, I think we just need a shared object, hosted by a joblet or task,
> > >>> which should also be driven by a broadcast connector between its input
> > >>> and output operators. We can talk about this by Skype if needed.
> > >>> Best, Wenhai
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>>> From: "Mike Carey" <dtabass@gmail.com>
> > >>>> Sent: Tuesday, October 11, 2016
> > >>>> To: dev@asterixdb.apache.org
> > >>>> Cc:
> > >>>> Subject: Re: Let one Operator finished the job before another one begin in Hyracks
> > >>>
> > >>>> And both Wenhai and Preston have examples of doing the
> > >>>> fan-in-and-compute/fan-back-out pattern with blocking until the latter
> > >>>> part is done - Wenhai for finding range split points for parallel
> > >>>> sorting and Preston for similar things that arise in interval joins.
> > >>>> Can you guys chime in when you have a chance?  (Preston may be busy from
> > >>>> what I saw on Skype on Friday :-), with congrats being due!)
> > >>>>
> > >>>>
> > >>>> On 10/11/16 12:22 AM, Jianfeng Jia wrote:
> > >>>>
> > >>>>> Based on the described example, it seems possible to implement it in
> > >>>>> one job by using MToNPartitioningConnectorDescriptor.
> > >>>>> You can force the merge-BF operator to run in only one partition by
> > >>>>> using the PartitionConstraintHelper.addAbsoluteLocationConstraint()
> > >>>>> function.
> > >>>
> > >>>>> On Oct 10, 2016, at 11:43 PM, mingda li <limingda1993@gmail.com> wrote:
> > >>>>>
> > >>>>>> Yeah, that will be easier. But for example, we have N nodes, and each
> > >>>>>> node will generate a Bloom Filter (BF) for its own data. We need to send
> > >>>>>> these BFs to one node to construct a complete BF and then send the BF
> > >>>>>> back to each node. I am not sure we can use a multiple-stage job for this,
> > >>>>>> because there should be a 1->N and an N->1 connector among nodes. If in one
> > >>>>>> job, there may be no way to transfer data among nodes.
> > >>>>>> This is my idea. If this can be implemented by one multiple-stage job, that
> > >>>>>> will save me a lot of work :-)
> > >>>>>>
> > >>>>>> Bests,
> > >>>>>> Mingda
> > >>>>>>
> > >>>>>> On Mon, Oct 10, 2016 at 8:59 PM, Mike Carey <dtabass@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Is there a reason for wanting two jobs?  I would think that one multiple
> > >>>>>>> stage job would be preferable.
> > >>>>>>>
> > >>>>>>> On Oct 10, 2016 1:21 PM, "mingda li" <limingda1993@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>> Oh, thanks Kim~
> > >>>>>>>>
> > >>>>>>>> On Mon, Oct 10, 2016 at 12:55 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Forwarded to dev.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Taewoo
> > >>>>>>>>>
> > >>>>>>>>> ---------- Forwarded message ----------
> > >>>>>>>>> From: mingda li <limingda1993@gmail.com>
> > >>>>>>>>> Date: Mon, Oct 10, 2016 at 11:21 AM
> > >>>>>>>>> Subject: Let one Operator finished the job before another one begin in Hyracks
> > >>>>>>>>> To: users@asterixdb.apache.org
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> Now, I am trying to build a Bloom Filter (BF) before a join. The BF is built
> > >>>>>>>>> in each node and sent to one node to be combined. I want to set a stop sign
> > >>>>>>>>> there before sending the BF in each node. The stop sign means a node can only
> > >>>>>>>>> send the BF after it is built.
> > >>>>>>>>> The method HyracksConnection.waitForCompletion may help with this. But I am not
> > >>>>>>>>> sure how to use it.
> > >>>>>>>>> Should I build two jobs: hcc.waitForCompletion(jobBuildBF);
> > >>>>>>>>> jobidSendBF=hcc.startJob(); ?
> > >>>>>>>>> Has anyone ever used the HyracksConnection.waitForCompletion?
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Mingda
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>
> > >
>
>
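
For readers following the thread, the fan-in-and-compute/fan-back-out pattern discussed above (each of N nodes builds a local Bloom filter, one node merges them, and the merged filter goes back to every node) can be sketched in plain Java. This is only an in-process illustration, not Hyracks code: the class and method names (BloomFilterMergeSketch, buildLocal, merge, mightContain), the filter size, and the hashing scheme are all assumptions made for the example. It does not show how the N->1 and 1->N connectors would be wired in a real job. The point it illustrates is the "stop sign": the merge step cannot produce its result until it has consumed all N local filters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// In-process sketch only: no Hyracks classes are used. All names here
// (BloomFilterMergeSketch, buildLocal, merge, mightContain) are hypothetical.
public class BloomFilterMergeSketch {
    static final int BITS = 1 << 12;   // filter size in bits (assumed)
    static final int HASHES = 3;       // hash probes per key (assumed)

    // Double hashing: probe i lands at (h1 + i * h2) mod BITS.
    static int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1;      // odd step, so the probes differ
        return Math.floorMod(h1 + i * h2, BITS);
    }

    // Step 1 (would run on every partition): build a BF over the local keys.
    static BitSet buildLocal(List<String> keys) {
        BitSet bf = new BitSet(BITS);
        for (String k : keys) {
            for (int i = 0; i < HASHES; i++) {
                bf.set(position(k, i));
            }
        }
        return bf;
    }

    // Step 2 (would run on one partition, N->1): OR all local filters together.
    // The result is complete only after every local filter has been consumed.
    static BitSet merge(List<BitSet> locals) {
        BitSet merged = new BitSet(BITS);
        for (BitSet bf : locals) {
            merged.or(bf);
        }
        return merged;
    }

    // Membership test: false means "definitely absent", true means "maybe present".
    static boolean mightContain(BitSet bf, String key) {
        for (int i = 0; i < HASHES; i++) {
            if (!bf.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a1", "a2"),
                Arrays.asList("b1"),
                Arrays.asList("c1", "c2"));

        List<BitSet> locals = new ArrayList<>();
        for (List<String> p : partitions) {
            locals.add(buildLocal(p));
        }
        BitSet merged = merge(locals);

        // Step 3 (1->N): every partition receives its own copy of the merged BF.
        for (int n = 0; n < partitions.size(); n++) {
            BitSet copy = (BitSet) merged.clone();
            System.out.println("partition " + n + " sees b1: " + mightContain(copy, "b1"));
        }
    }
}
```

Because merging Bloom filters of equal size and hash scheme is a bitwise OR, the merged filter never reports a key as absent if any partition inserted it. That property is what makes a single merge point sufficient.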
