hadoop-pig-dev mailing list archives

From Mridul Muralidharan <mrid...@yahoo-inc.com>
Subject Re: [jira] Updated: (PIG-690) UNION doesn't work in the latest code
Date Wed, 04 Mar 2009 07:13:13 GMT

Great, thanks!
I assume this might also fix the load-related schema issues (with
BinStorage)? It looks similar to the issue I reported in the Pig user group.

- Mridul

Pradeep Kamath (JIRA) wrote:
>      [ https://issues.apache.org/jira/browse/PIG-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
> Pradeep Kamath updated PIG-690:
> -------------------------------
> 
>         Fix Version/s: types_branch
>              Assignee: Pradeep Kamath
>     Affects Version/s: types_branch
>                Status: Patch Available  (was: Open)
> 
> The root cause of the issue is that, while merging schemas, the code recursively
> merges subschemas if a field is a tuple or a bag. At that point, it does not
> correctly set the type to bag when the field is a bag; it always marks the type
> as tuple whenever the field schema is of type bag or tuple. This is fixed in the
> patch, and a unit test case has been added that unions two relations which have
> a bag field.
> 
> 
>> UNION doesn't work in the latest code
>> -------------------------------------
>>
>>                 Key: PIG-690
>>                 URL: https://issues.apache.org/jira/browse/PIG-690
>>             Project: Pig
>>          Issue Type: Bug
>>    Affects Versions: types_branch
>>         Environment: mapred mode. Local mode has the same problem under Linux.
>> Code is taken from trunk.
>>            Reporter: Amir Youssefi
>>            Assignee: Pradeep Kamath
>>             Fix For: types_branch
>>
>>         Attachments: PIG-690.patch
>>
>>
>> grunt> a = load 'tmp/f1' using BinStorage();
>> grunt> b = load 'tmp/f2' using BinStorage();
>> grunt> describe a;
>> a: {int,chararray,int,{(int,chararray,chararray)}}
>> grunt> describe b;
>> b: {int,chararray,int,{(int,chararray,chararray)}}
>> grunt> c = union a,b;
>> grunt> describe c;
>> 2009-02-27 11:51:46,012 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052: Cannot cast bag with schema bag({(int,chararray,chararray)}) to tuple with schema tuple
>> Details at logfile: /homes/amiry/pig_1235735380348.log
>> dump a and dump b work fine.
>> Sample data provided to dev team in an e-mail.
> 
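
The root cause described in the quoted JIRA update can be illustrated with a
small sketch. This is not Pig's actual code (Pig is written in Java); the
Field class, merge_field, merge_schema, describe, and the buggy flag are all
hypothetical names, assumed here only to show how a recursive merge that
always labels nested fields as tuple would lose the bag type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical, simplified stand-in for a Pig field schema."""
    type: str                             # e.g. "int", "chararray", "tuple", "bag"
    sub: Optional[List["Field"]] = None   # nested schema for tuple/bag fields


def merge_field(x: Field, y: Field, buggy: bool = False) -> Field:
    """Merge two fields of the same type, recursing into tuples and bags."""
    if x.type != y.type:
        raise TypeError(f"cannot merge {x.type} with {y.type}")
    if x.type in ("tuple", "bag"):
        merged_sub = merge_schema(x.sub or [], y.sub or [], buggy)
        # Assumed shape of the bug: the nested field is always labelled tuple.
        # The fixed behaviour keeps the original type (bag stays bag).
        merged_type = "tuple" if buggy else x.type
        return Field(merged_type, merged_sub)
    return Field(x.type)


def merge_schema(a: List[Field], b: List[Field], buggy: bool = False) -> List[Field]:
    """Merge two schemas field by field."""
    if len(a) != len(b):
        raise ValueError("schemas have different numbers of fields")
    return [merge_field(x, y, buggy) for x, y in zip(a, b)]


def describe(schema: List[Field]) -> str:
    """Render a relation schema roughly the way grunt's describe prints it."""
    def render(f: Field) -> str:
        inner = ",".join(render(s) for s in (f.sub or []))
        if f.type == "bag":
            return "{" + inner + "}"
        if f.type == "tuple":
            return "(" + inner + ")"
        return f.type
    return "{" + ",".join(render(f) for f in schema) + "}"


def sample_schema() -> List[Field]:
    """Schema matching the relations a and b in the report."""
    row = Field("tuple", [Field("int"), Field("chararray"), Field("chararray")])
    return [Field("int"), Field("chararray"), Field("int"), Field("bag", [row])]


if __name__ == "__main__":
    a, b = sample_schema(), sample_schema()
    print(describe(merge_schema(a, b)))              # bag type preserved
    print(describe(merge_schema(a, b, buggy=True)))  # bag field mislabelled as tuple
```

With buggy=True the fourth field of the merged schema comes back as a tuple,
which is the kind of mismatch that could surface as the "Cannot cast bag ...
to tuple" error in the report; with the default, the bag type is preserved.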

