pig-dev mailing list archives

From Gang Luo <lgpub...@yahoo.com.cn>
Subject Re: the last job in the mapreduce plan
Date Wed, 16 Jun 2010 14:20:25 GMT
Thanks for replying. Actually, I haven't observed this happening in Pig so far. But one of the
operators I am implementing in Pig requires ending the current MR operator right after it, so
the issue may arise in my case.


----- Original Message -----
From: Ashutosh Chauhan <ashutosh.chauhan@gmail.com>
To: pig-dev@hadoop.apache.org
Sent: 2010/6/15 (Tue) 1:24:46 PM
Subject: Re: the last job in the mapreduce plan


What you are describing can never happen, because we create a new MR
operator only when we have a blocking operator which needs to go in
the next MR operator. We don't create a new MR operator a priori without
looking at the next physical operator in the pipeline. If you are seeing
this happen, I would consider that a bug.
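
To make the rule above concrete, here is a minimal sketch in Python. It is a toy model, not Pig's actual compiler code: the operator names, the BLOCKING set, the store_tmp/load_tmp placeholders, and the one-shuffle-per-job simplification are all assumptions made for illustration. The point it shows is that a new job is opened only when the next operator is blocking, so a trailing store always lands in the job that did the last real work.

```python
# Toy model only; names and the BLOCKING set are illustrative assumptions,
# not Pig's real operator classes.
BLOCKING = {"group", "join", "order", "distinct"}


def compile_plan(physical_ops):
    """Split a linear pipeline of physical operators into MR jobs.

    Each job is modeled as a list of operator names and may contain at
    most one blocking operator (one shuffle).
    """
    jobs = [[]]
    for op in physical_ops:
        current = jobs[-1]
        # Open a new job only when the operator being visited is blocking
        # and the current job has already used its single shuffle.
        if op in BLOCKING and any(o in BLOCKING for o in current):
            current.append("store_tmp")   # hand-off to the next job
            jobs.append(["load_tmp"])
        jobs[-1].append(op)
    return jobs


plan = compile_plan(["load", "filter", "group", "foreach", "order", "store"])
for i, job in enumerate(plan):
    print(i, job)
```

Because a store is not blocking, it never triggers a new job in this model; a job containing only a load and a store could appear only if the whole query has no blocking operator at all.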


On Tue, Jun 15, 2010 at 09:26, Alan Gates <gates@yahoo-inc.com> wrote:
> I've never seen a case where this happens. Is this a theoretical question
> or are you seeing this issue?
> Alan.
> On Jun 15, 2010, at 8:49 AM, Gang Luo wrote:
>> Hi,
>> Is it possible for the last MapReduce job in the MR plan to only load something
>> and store it without any other processing in between? For example, when
>> visiting some physical operator, we need to end the current MR operator
>> after embedding the physical operator into it, and create a new MR
>> operator for later physical operators. Unfortunately, the following physical
>> operator is a store, the end of the entire query. In this case, the last MR
>> operator contains only a load and a store without any meaningful work in between.
>> This idle MapReduce job will degrade performance. Can this happen in
>> Pig?
>> Thanks,
>> -Gang

