hadoop-pig-dev mailing list archives

From "Santhosh Srinivasan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1016) Reading in map data seems broken
Date Thu, 29 Oct 2009 04:44:59 GMT

    [ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771287#action_12771287 ]

Santhosh Srinivasan commented on PIG-1016:
------------------------------------------

I am summarizing my understanding of the patch that has been submitted by hc busy.

Root cause: PIG-880 changed the value type of maps in PigStorage from native Java types to
DataByteArray. As a result of this change, parsing of complex types as map values was disabled.

Proposed fix: Revert the changes made as part of PIG-880 so that map values are again interpreted
as Java types. In addition, change the comparison method to check the object type and call the
appropriate compareTo method. The latter is required to work around the fact that the front-end
assigns the value type to be DataByteArray whereas the backend sees the actual type (Integer,
Long, Tuple, DataBag, etc.)
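
To make the second half of the proposed fix concrete, here is a minimal sketch of a comparison that
inspects the runtime type and delegates to that type's own compareTo. It is not the code in the
patch: the class MapValueComparison and the method compareValues are hypothetical names, plain JDK
types stand in for Pig's value types, and ordering mismatched types by class name is an assumption
made only so the sketch defines a total order.

```java
// Hypothetical stand-in, not Pig code: compares two map values whose runtime
// types are only known at execution time.
final class MapValueComparison {

    private MapValueComparison() {}

    // Nulls sort first and two nulls compare equal, matching the convention
    // used in the patch under review.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int compareValues(Object a, Object b) {
        if (a == null && b == null) return 0;
        if (a == null) return -1;
        if (b == null) return 1;

        // Same concrete type: delegate to that type's own compareTo.
        if (a.getClass() == b.getClass() && a instanceof Comparable) {
            return ((Comparable) a).compareTo(b);
        }

        // Assumption for this sketch: values of different types are ordered
        // by class name so the ordering stays total and deterministic.
        return a.getClass().getName().compareTo(b.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(compareValues(1, 2));       // two Integers
        System.out.println(compareValues("b", "a"));   // two Strings
        System.out.println(compareValues(null, 5L));   // null against a Long
        System.out.println(compareValues(1, 1L));      // Integer against Long
    }
}
```

The point of the sketch is only that the dispatch happens on what the backend actually sees, not on
the type the front-end declared.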

Based on this understanding I have the following review comment(s).

Index: src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBytesRawComparator.java
===================================================================

Can you explain the checks in the if and the else? Specifically, NullableBytesWritable is
a subclass of PigNullableWritable. In the if part, the second clause,
!(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable), holds whenever o1 or
o2 is not a PigNullableWritable. That is confusing, because in that case o1 and o2 are still cast
to NullableBytesWritable (nbw1 and nbw2), and that cast cannot succeed for an object that is not
a PigNullableWritable.

{code}
+        // find bug is complaining about nulls. This check sequence will prevent nulls from being dereferenced.
+        if(o1!=null && o2!=null){
+    
+            // In case the objects are comparable
+            if((o1 instanceof NullableBytesWritable && o2 instanceof NullableBytesWritable)||
+               !(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable)
+                ){
+    
+              NullableBytesWritable nbw1 = (NullableBytesWritable)o1;
+              NullableBytesWritable nbw2 = (NullableBytesWritable)o2;
+      
+              // If either are null, handle differently.
+              if (!nbw1.isNull() && !nbw2.isNull()) {
+                  rc = ((DataByteArray)nbw1.getValueAsPigType()).compareTo((DataByteArray)nbw2.getValueAsPigType());
+              } else {
+                  // For sorting purposes two nulls are equal.
+                  if (nbw1.isNull() && nbw2.isNull()) rc = 0;
+                  else if (nbw1.isNull()) rc = -1;
+                  else rc = 1;
+              }
+            }else{
+              // enter here only if both o1 and o2 are non-NullableByteWritable PigNullableWritable's
+              PigNullableWritable nbw1 = (PigNullableWritable)o1;
+              PigNullableWritable nbw2 = (PigNullableWritable)o2;
+              // If either are null, handle differently.
+              if (!nbw1.isNull() && !nbw2.isNull()) {
+                  rc = nbw1.compareTo(nbw2);
+              } else {
+                  // For sorting purposes two nulls are equal.
+                  if (nbw1.isNull() && nbw2.isNull()) rc = 0;
+                  else if (nbw1.isNull()) rc = -1;
+                  else rc = 1;
+              }
+            }
+        }else{
+          if(o1==null && o2==null){rc=0;}
+          else if(o1==null) {rc=-1;}
+          else{ rc=1; }
{code}
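
The concern about the if condition can be checked in isolation. The sketch below is only an
illustration: BaseWritable, BytesWritableKind and IntWritableKind are made-up stand-ins that mirror
the inheritance relationship described above (they are not the Pig classes), and takesCastBranch
copies just the boolean shape of the condition quoted in the patch.

```java
// Stand-in types for this sketch only; they mirror the inheritance
// relationship described in the review, not the real Pig classes.
class BaseWritable {}                             // plays PigNullableWritable
class BytesWritableKind extends BaseWritable {}   // plays NullableBytesWritable
class IntWritableKind extends BaseWritable {}     // some other subclass

final class BranchConditionCheck {

    private BranchConditionCheck() {}

    // Same boolean shape as the condition in the patch.
    static boolean takesCastBranch(Object o1, Object o2) {
        return (o1 instanceof BytesWritableKind && o2 instanceof BytesWritableKind)
                || !(o1 instanceof BaseWritable && o2 instanceof BaseWritable);
    }

    // True when the casts performed inside that branch would succeed
    // for two non-null arguments.
    static boolean castsAreSafe(Object o1, Object o2) {
        return o1 instanceof BytesWritableKind && o2 instanceof BytesWritableKind;
    }

    public static void main(String[] args) {
        Object[][] pairs = {
            {new BytesWritableKind(), new BytesWritableKind()},
            {new IntWritableKind(), new IntWritableKind()},
            {new BytesWritableKind(), "not a writable"},
        };
        for (Object[] p : pairs) {
            System.out.println(p[0].getClass().getSimpleName() + ", "
                    + p[1].getClass().getSimpleName()
                    + " -> branch=" + takesCastBranch(p[0], p[1])
                    + ", castsSafe=" + castsAreSafe(p[0], p[1]));
        }
    }
}
```

With these stand-ins, a pair in which one argument is outside the BaseWritable hierarchy takes the
casting branch even though the casts would not be safe. Whether such a pair can ever reach the
comparator in practice is a separate question for the patch author.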

> Reading in map data seems broken
> --------------------------------
>
>                 Key: PIG-1016
>                 URL: https://issues.apache.org/jira/browse/PIG-1016
>             Project: Pig
>          Issue Type: Improvement
>          Components: data
>    Affects Versions: 0.4.0
>            Reporter: hc busy
>             Fix For: 0.5.0
>
>         Attachments: PIG-1016.patch
>
>
> Hi, I'm trying to load a map that has a tuple for a value. The read fails in 0.4.0 because of a misconfiguration in the parser, whereas almost all documentation states that the value of a map can be any type.
> I've attached a patch that allows us to read in complex objects as values, as documented. I've done simple verification by loading maps with tuple/map values and writing them back out using LOAD and STORE. All seems to work fine.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

