hadoop-pig-dev mailing list archives

From "Jing Huang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour
Date Fri, 11 Sep 2009 21:16:58 GMT

    [ https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754353#action_12754353 ]

Jing Huang commented on PIG-949:
--------------------------------

Thanks, Alok.
I am able to reproduce the problem.
I tested the map split using only the I/O layer (not the Pig loader).
This is what I did:
  final static String STR_SCHEMA = "m1:map(string),m2:map(map(int))";
  final static String STR_STORAGE = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}];[m1]";
.......create table and insert data ......

load:  String projection = new String("m1#{a}");

I got only null back.

============

Without the storage hint [m1], everything works fine, i.e.:
 final static String STR_STORAGE = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}]";
 .......create table and insert data ......
load:  String projection = new String("m1#{a}");
I am able to get the value of m1#{a}.
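
To make the difference between the two storage hints concrete, here is a minimal, self-contained Java sketch. It is not Zebra code and uses no Zebra API: the class StorageHintCheck and its methods are hypothetical names made up for illustration, and the parsing assumes only the hint syntax visible in the strings above. It reports map columns that are both split by key and listed whole in another column group, which is the pattern that produced the null result here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Zebra.
public class StorageHintCheck {

    /** Returns the text inside each [...] pair of a storage hint string. */
    public static List<String> columnGroups(String hint) {
        List<String> groups = new ArrayList<>();
        int open = hint.indexOf('[');
        while (open >= 0) {
            int close = hint.indexOf(']', open);
            if (close < 0) {
                break; // unbalanced bracket: stop scanning
            }
            groups.add(hint.substring(open + 1, close).trim());
            open = hint.indexOf('[', close);
        }
        return groups;
    }

    /** Returns map columns that appear both key-split (m#{k}) and whole (m). */
    public static Set<String> mapsSplitAndWhole(String hint) {
        Set<String> split = new LinkedHashSet<>();
        Set<String> whole = new LinkedHashSet<>();
        for (String group : columnGroups(hint)) {
            for (String field : group.split(",")) {
                String f = field.trim();
                if (f.isEmpty()) {
                    continue;
                }
                int hash = f.indexOf("#{");
                if (hash >= 0) {
                    split.add(f.substring(0, hash)); // e.g. "m1" from "m1#{a}"
                } else {
                    whole.add(f);                    // e.g. "m1" from "[m1]"
                }
            }
        }
        split.retainAll(whole); // keep only columns seen in both forms
        return split;
    }

    public static void main(String[] args) {
        String failing = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}];[m1]";
        String working = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}]";
        System.out.println(mapsSplitAndWhole(failing)); // prints [m1]
        System.out.println(mapsSplitAndWhole(working)); // prints []
    }
}
```

For the failing hint the sketch flags m1, because m1#{a} and m1#{b} are split out while [m1] is also declared as its own group; for the working hint nothing is flagged.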

The Zebra team is working on the fix.



> Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour
> --------------------------------------------------------------------------------------------------
>
>                 Key: PIG-949
>                 URL: https://issues.apache.org/jira/browse/PIG-949
>             Project: Pig
>          Issue Type: Bug
>         Environment: linux
>            Reporter: Alok Singh
>
> Hi,
> The storage hint specification plays an important part in whether the output table is readable or not.
> Say we have a map named 'map'.
> One can split the map into a column group using [map#{k1}, map#{k2}...];
> the remaining map fields are then automatically added to the default group.
> If the user tries to create a new column group for the remaining fields as follows,
> [map#{k1}, map#{k2}, ..][map], i.e. creates a separate column group,
> the table writer will create the table.
> However, if one tries to load the created table via Pig or via MapReduce using TableInputFormat,
> the reader has a problem reading the map.
> We get the following stack trace:
> 09/09/09 00:09:45 INFO mapred.JobClient: Task Id : attempt_200908191538_33939_m_000021_2, Status : FAILED
> java.io.IOException: getValue() failed: null
>         at org.apache.hadoop.zebra.io.BasicTable$Reader$BTScanner.getValue(BasicTable.java:775)
>         at org.apache.hadoop.zebra.mapred.TableRecordReader.next(TableInputFormat.java:717)
>         at org.apache.hadoop.zebra.mapred.TableRecordReader.next(TableInputFormat.java:651)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Alok

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

