asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Let one Operator finished the job before another one begin in Hyracks
Date Tue, 11 Oct 2016 16:32:57 GMT
BUT AGAIN:  I think the preferred solution in this case is to do it in 
one job.  Mingda, I would suggest sync'ing up with Wenhai for a Skype 
meeting on how he/Preston have done essentially the very same thing in 
their use cases for parallel sorts and interval joins.  Hyracks has 
everything needed for this, as it turns out, without a multi-job need.


On 10/11/16 9:26 AM, Yingyi Bu wrote:
> You can search the usage of waitForCompletion in the code base, e.g.:
>
> APIFramework.java:
>
> public void executeJobArray(IHyracksClientConnection hcc,
>         JobSpecification[] specs, PrintWriter out) throws Exception {
>     for (JobSpecification spec : specs) {
>         spec.setMaxReattempts(0);
>         JobId jobId = hcc.startJob(spec);
>         long startTime = System.currentTimeMillis();
>         // Block until this job finishes before starting the next one.
>         hcc.waitForCompletion(jobId);
>         long endTime = System.currentTimeMillis();
>         double duration = (endTime - startTime) / 1000.00;
>         out.println("<pre>Duration: " + duration + " sec</pre>");
>     }
> }
>
>
> You start a job and get the job Id, and then you can wait on the job id.
>
>
> Best,
>
> Yingyi
>
>
> On Tue, Oct 11, 2016 at 1:45 AM, 李文海 <lwh@whu.edu.cn> wrote:
>
>> Hi, Mingda.
>>      What you need is quite similar to what Preston and I have done.
>> Actually, I think we just need a shared
>> object held by the joblet or task, which should also be driven by a
>> broadcast connector between its input
>> and output operators. We can talk about this over Skype if needed.
>> Best, Wenhai
>>
>>
>>> -----Original Message-----
>>> From: "Mike Carey" <dtabass@gmail.com>
>>> Sent: Tuesday, October 11, 2016
>>> To: dev@asterixdb.apache.org
>>> Cc:
>>> Subject: Re: Let one Operator finished the job before another one begin in Hyracks
>>> And both Wenhai and Preston have examples of doing the
>>> fan-in-and-compute/fan-back-out pattern with blocking until the latter
>>> part is done - Wenhai for finding range split points for parallel
>>> sorting and Preston for similar things that arise in interval joins.
>>> Can you guys chime in when you have a chance?  (Preston may be busy from
>>> what I saw on Skype on Friday :-), with congrats being due!)
>>>
>>>
>>> On 10/11/16 12:22 AM, Jianfeng Jia wrote:
>>>> Based on the described example, it seems possible to implement it in
>>>> one job by using MToNPartitioningConnectorDescriptor.
>>>> You can force the merge-BF operator to run in only one partition by
>>>> using the PartitionConstraintHelper.addAbsoluteLocationConstraint() function.
>>>>> On Oct 10, 2016, at 11:43 PM, mingda li <limingda1993@gmail.com> wrote:
>>>>> Yeah, that will be easier. But for example, we have N nodes, and each
>>>>> node generates a Bloom Filter (BF) for its own data. We need to send
>>>>> these BFs to one node to construct a complete BF and then send that BF
>>>>> back to each node. I am not sure we can use a multiple-stage job for
>>>>> this, because there should be a 1->N and an N->1 connector among the
>>>>> nodes. If it is all in one job, there may be no way to transfer data
>>>>> among nodes.
>>>>> This is my idea. If this can be implemented as one multiple-stage job,
>>>>> that will save me a lot of work :-)
>>>>>
>>>>> Bests,
>>>>> Mingda
>>>>>
>>>>> On Mon, Oct 10, 2016 at 8:59 PM, Mike Carey <dtabass@gmail.com> wrote:
>>>>>> Is there a reason for wanting two jobs?  I would think that one
>>>>>> multiple stage job would be preferable.
>>>>>>
>>>>>> On Oct 10, 2016 1:21 PM, "mingda li" <limingda1993@gmail.com> wrote:
>>>>>>
>>>>>>> Oh, thanks Kim~
>>>>>>>
>>>>>>> On Mon, Oct 10, 2016 at 12:55 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>>>>>>>> Forwarded to dev.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Taewoo
>>>>>>>>
>>>>>>>> ---------- Forwarded message ----------
>>>>>>>> From: mingda li <limingda1993@gmail.com>
>>>>>>>> Date: Mon, Oct 10, 2016 at 11:21 AM
>>>>>>>> Subject: Let one Operator finished the job before another one begin in Hyracks
>>>>>>>> To: users@asterixdb.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Now, I am trying to build a Bloom Filter (BF) before a join. The BF
>>>>>>>> is built in each node and sent to one node to be combined. I want to
>>>>>>>> set a stop sign there before sending the BF from each node. The stop
>>>>>>>> sign means a node can only send the BF after it is built.
>>>>>>>> The method HyracksConnection.waitForCompletion may help with this,
>>>>>>>> but I am not sure how to use it.
>>>>>>>> Should I build two jobs: hcc.waitForCompletion(jobBuildBF);
>>>>>>>> jobidSendBF=hcc.startJob(); ?
>>>>>>>> Has anyone ever used HyracksConnection.waitForCompletion?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Mingda
>>>>>>>>
>>

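The Bloom filter flow discussed in this thread (build a filter on each node, combine them on one node, send the combined filter back) can be sketched in plain Java. This is an illustration only, not Hyracks code: the class `BloomMergeSketch`, its methods, the filter size, and the two hash functions are all assumptions made for the example. The sketch relies on one fact: filters built with the same size and the same hash functions can be combined with a bitwise OR.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Plain-Java sketch; none of these names come from Hyracks.
final class BloomMergeSketch {
    // Every partition must use the same size and hash functions,
    // otherwise OR-ing the filters together is meaningless.
    static final int NUM_BITS = 1 << 12;

    // Two cheap bit positions per key (illustrative, not tuned).
    static int h1(long key) {
        return Math.floorMod(Long.hashCode(key * 0x9E3779B97F4A7C15L), NUM_BITS);
    }

    static int h2(long key) {
        return Math.floorMod(
                Long.hashCode(Long.rotateLeft(key, 31) * 0xC2B2AE3D27D4EB4FL), NUM_BITS);
    }

    // Stage 1: each partition builds a filter over its local keys.
    static BitSet buildLocal(long[] keys) {
        BitSet bits = new BitSet(NUM_BITS);
        for (long k : keys) {
            bits.set(h1(k));
            bits.set(h2(k));
        }
        return bits;
    }

    // Stage 2 (N -> 1): one partition ORs all local filters together.
    static BitSet merge(List<BitSet> locals) {
        BitSet merged = new BitSet(NUM_BITS);
        for (BitSet b : locals) {
            merged.or(b);
        }
        return merged;
    }

    // Stage 3 (1 -> N): every partition receives its own copy.
    static List<BitSet> broadcast(BitSet merged, int partitions) {
        List<BitSet> copies = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            copies.add((BitSet) merged.clone());
        }
        return copies;
    }

    // May return false positives, never false negatives.
    static boolean mightContain(BitSet bits, long key) {
        return bits.get(h1(key)) && bits.get(h2(key));
    }

    public static void main(String[] args) {
        List<BitSet> locals = new ArrayList<>();
        locals.add(buildLocal(new long[] {1L, 2L, 3L}));
        locals.add(buildLocal(new long[] {100L, 200L}));
        BitSet merged = merge(locals);
        // A key inserted on any partition is always reported as present.
        System.out.println(mightContain(merged, 200L));
    }
}
```

The merge step must not start until every local filter is complete, which is the blocking behaviour the thread is about. In this sketch that ordering is trivially enforced by sequential calls.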

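The two-job alternative from the thread (start a job, wait on its id, then start the next one) can be mimicked with the standard library. `MiniJobRunner` below is a hypothetical stand-in whose `startJob` and `waitForCompletion` only imitate the shape of the calls quoted from APIFramework.java; it is not the Hyracks client API, and the integer job ids and stage names are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a job client; not the Hyracks API.
final class MiniJobRunner implements AutoCloseable {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final Map<Integer, Future<?>> running = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Submits the job asynchronously and returns an id to wait on.
    int startJob(Runnable job) {
        int id = nextId.incrementAndGet();
        running.put(id, pool.submit(job));
        return id;
    }

    // Blocks the caller until the job with this id has finished.
    void waitForCompletion(int jobId) {
        Future<?> f = running.remove(jobId);
        if (f == null) {
            throw new IllegalArgumentException("unknown job " + jobId);
        }
        try {
            f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("job " + jobId + " failed", e);
        }
    }

    @Override
    public void close() {
        pool.shutdown();
    }

    // Runs "build" to completion before "send" is even started.
    static List<String> runTwoStages() {
        List<String> log = new CopyOnWriteArrayList<>();
        try (MiniJobRunner runner = new MiniJobRunner()) {
            int buildId = runner.startJob(() -> log.add("build BF"));
            runner.waitForCompletion(buildId);
            int sendId = runner.startJob(() -> log.add("send BF"));
            runner.waitForCompletion(sendId);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runTwoStages());
    }
}
```

The ordering guarantee comes entirely from the client waiting between submissions, which is why the thread prefers a single multi-stage job: the barrier then lives inside the job rather than in client code.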