hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data
Date Thu, 10 Sep 2009 21:18:57 GMT

    [ https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753823#action_12753823
] 

Dmitriy V. Ryaboy commented on PIG-953:
---------------------------------------

Pradeep -- I would argue that my rewrite is not more terse, but less verbose :-)  
I just don't see how
{code}
if (boolean) {
  return true;
} else {
  return false;
}
{code}

is ever better or more readable than

{code}
return boolean;
{code}

As far as oddness, I would argue that since the person who wrote the function introduced a
possible NPE a few lines lower, someone else can hardly be expected to use it properly :-).
 See my comment about checkEquality being false.  The function as written misses a corner
case.


> Enable merge join in pig to work with loaders and store functions which can internally
index sorted data 
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-953
>                 URL: https://issues.apache.org/jira/browse/PIG-953
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.3.0
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>         Attachments: PIG-953.patch
>
>
> Currently merge join implementation in pig includes construction of an index on sorted
data and use of that index to seek into the "right input" to efficiently perform the join
operation. Some loaders (notably the zebra loader) internally implement an index on sorted
data and can perform this seek efficiently using their index. So the use of the index needs
to be abstracted in such a way that when the loader supports indexing, pig uses it (indirectly
through the loader) and does not construct an index. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message