hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-556) FindQuantiles function does not report progress
Date Fri, 05 Dec 2008 01:10:44 GMT

     [ https://issues.apache.org/jira/browse/PIG-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-556:
---------------------------

    Attachment: PIG-556.patch

The attached patch adds the missing progress() calls to FindQuantiles.  It also fixes two
other problems:

# On order bys, the sampling job was running the default number of reduces even though all
data was sent to one reduce.  The job now runs with a single reduce.
# The progress reporter passed to EvalFuncs was null.  I believe this is because it is
not properly serialized and sent from the front end to the back end.  I changed it
so that the progress reporter is set each time the EvalFunc is called.  I also made
EvalFunc.setReporter final so that these set calls can hopefully be inlined by the Java optimizer.
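
For readers without the patch at hand, the reporter handling described above can be sketched
roughly as follows.  This is a minimal illustration and not the code in PIG-556.patch: the
types ProgressReporter, SketchEvalFunc, SketchFindQuantiles and ProgressSketch are hypothetical
stand-ins rather than Pig's classes, and the reporting interval of 1000 records is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the back end's progress callback (not Pig's interface).
interface ProgressReporter {
    void progress();
}

// Hypothetical stand-in for an eval function base class.
abstract class SketchEvalFunc<T> {
    private ProgressReporter reporter;

    // final: subclasses cannot override it, which leaves the JIT free to inline the call.
    public final void setReporter(ProgressReporter reporter) {
        this.reporter = reporter;
    }

    // Safe to call even if no reporter was ever supplied.
    protected final void progress() {
        if (reporter != null) {
            reporter.progress();
        }
    }

    public abstract T exec(List<Long> input);
}

// Picks evenly spaced cut points from an already sorted sample and
// reports progress periodically while scanning it.
class SketchFindQuantiles extends SketchEvalFunc<List<Long>> {
    private static final int REPORT_EVERY = 1000; // assumed interval, in records
    private final int numQuantiles;

    SketchFindQuantiles(int numQuantiles) {
        this.numQuantiles = numQuantiles;
    }

    @Override
    public List<Long> exec(List<Long> sortedSample) {
        List<Long> quantiles = new ArrayList<>();
        int n = sortedSample.size();
        if (n == 0 || numQuantiles <= 0) {
            return quantiles;
        }
        int step = Math.max(1, n / numQuantiles);
        for (int i = 0; i < n; i++) {
            if (i % REPORT_EVERY == 0) {
                progress(); // tell the framework the task is still alive
            }
            if (i % step == 0 && quantiles.size() < numQuantiles) {
                quantiles.add(sortedSample.get(i));
            }
        }
        return quantiles;
    }
}

class ProgressSketch {
    // The caller sets the reporter before every call instead of relying on
    // a reporter that was serialized along with the function.
    static <T> T call(SketchEvalFunc<T> func, ProgressReporter reporter, List<Long> input) {
        func.setReporter(reporter);
        return func.exec(input);
    }
}
```

Setting the reporter on every call costs one field write per invocation, which is why making
the setter final (and so a candidate for inlining) matters in a per-record code path.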

> FindQuantiles function does not report progress
> -----------------------------------------------
>
>                 Key: PIG-556
>                 URL: https://issues.apache.org/jira/browse/PIG-556
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>             Fix For: types_branch
>
>         Attachments: PIG-556.patch
>
>
> FindQuantiles does not call progress.  Most of the time this isn't an issue as it finishes
> quickly.  But for very large (multi-terabyte) order bys, it does not finish in the required
> five minutes.  This causes the whole job to fail.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

