pig-dev mailing list archives

From Gianmarco De Francisci Morales <g...@apache.org>
Subject Re: e2e tests for Rank function
Date Wed, 26 Sep 2012 13:42:15 GMT
I was able to reproduce the bug and opened PIG-2932 to track it.

Cheers,
--
Gianmarco



On Wed, Sep 26, 2012 at 12:07 PM, Gianmarco De Francisci Morales <
gdfm@apache.org> wrote:

> Forwarding to pig-dev.
>
> In summary, it looks like we have a regression in trunk.
> We need to investigate it before branching 0.11.
>
> Cheers,
> --
> Gianmarco
>
>
>
> ---------- Forwarded message ----------
> From: Allan <aavendan@gmail.com>
> Date: Wed, Sep 26, 2012 at 11:21 AM
> Subject: Re: e2e tests for Rank function
> To: cheolsoo <cheolsoo@cloudera.com>, Gianmarco De Francisci Morales <
> gdfm@apache.org>
>
>
> Hi Cheolsoo and Gianmarco,
>
> I double-checked the e2e tests and reproduced the scenario: the report is
> correct, the tests are failing.
>
> Then, looking for a possible reason, I tried the following script:
>
> SET default_parallel 9;
> A = LOAD 'prerank' using PigStorage(',') as
> (rownumber:long,rankcabd:long,rankbdaa:long,rankbdca:long,rankaacd:long,rankaaba:long,a:int,b:int,c:int,tail:bytearray);
> B = group A by (a, b);
> C = foreach B generate flatten(group),A;
> D = order C by group::a ASC, group::b ASC;
>
>
> And it fails with the same exception message.
>
> Then I tried the same script, but omitting the "SET default_parallel 9;",
> and it works. So I'm really surprised that in local mode it doesn't work
> with parallelism.
>
> I used this script because the RANK (RANK BY) operator uses the same
> chain of operators: GROUP (B), a flatten (C), SORT (D).
>
> Best regards,
>
> On Sun, Sep 23, 2012 at 10:43 PM, Cheolsoo Park <cheolsoo@cloudera.com> wrote:
>
>> Hello,
>>
>> The e2e tests for the Rank function in trunk do not pass for me when
>> running in local mode. I am wondering whether they all pass for everyone.
>>
>> What I am doing is as follows:
>>
>> ant clean
>> ant -Dhadoopversion=20 ... test-e2e-deploy-local
>> ant -Dhadoopversion=20 ... test-e2e-local -Dtests.to.run="-t Rank"
>>
>> All tests except Rank_4 fail with errors similar to this:
>>
>> java.io.IOException: Illegal partition for Null: false index: 0 (1,7) (1)
>>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1073)
>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
>>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:123)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:285)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>
>> I wanted to double check whether I am doing something wrong before I
>> open a jira.
>>
>> Thanks,
>> Cheolsoo
>>
>
>
>
> --
>
> Allan Avendaño S.
> Computer Engineer
> SWY22 Participant
> GSOC 2012 Participant
> Rome - Italy
> Gmail: aavendan@gmail.com
> --
>
>
>
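
Archive note on the exception quoted above: the thread does not establish the root cause (that is what PIG-2932 tracks), but the message itself has the shape "Illegal partition for <key> (<partition>)", which is what a map-side collector reports when a partitioner returns an index outside the range of available reduce partitions. The Python sketch below only illustrates that kind of range check. The function name `collect` and its parameters are hypothetical, and the assumption that a single reduce partition was available in local mode is a guess, not something confirmed in this thread.

```python
def collect(key, partition, num_partitions):
    """Hypothetical stand-in for a map-side collector.

    Rejects any record whose partition index is outside
    [0, num_partitions), mirroring the shape of the message in the trace.
    """
    if partition < 0 or partition >= num_partitions:
        raise IOError(f"Illegal partition for {key} ({partition})")
    return (partition, key)


# Key text copied from the quoted exception; partition 1 is the value
# shown in parentheses at the end of that message.
key = "Null: false index: 0 (1,7)"

# ASSUMPTION: nine partitions were planned (SET default_parallel 9).
# With nine partitions available, index 1 is accepted.
accepted = collect(key, 1, 9)

# ASSUMPTION: only one partition is actually available at run time.
# Index 1 is then out of range and the record is rejected.
try:
    collect(key, 1, 1)
    message = None
except IOError as err:
    message = str(err)
```

Under these assumptions, any mismatch between the partition count a partitioner was planned for and the count the job actually runs with would produce this failure, which would be consistent with the script working once `SET default_parallel 9;` is removed.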
