pig-dev mailing list archives

From "Bill Graham (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2587) Compute LogicalPlan signature and store in job conf
Date Thu, 29 Mar 2012 06:01:53 GMT

    [ https://issues.apache.org/jira/browse/PIG-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241007#comment-13241007

Bill Graham commented on PIG-2587:

I agree if cosmetic changes happen to the script, all bets are off and you'll get a different

I also agree that the 3 items are out of scope here. Handling the versions of registered jars
would be ugly, because transitive dependencies could change without being detected.

> Compute LogicalPlan signature and store in job conf
> ---------------------------------------------------
>                 Key: PIG-2587
>                 URL: https://issues.apache.org/jira/browse/PIG-2587
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>              Labels: 0.10_blocker
>             Fix For: 0.10, 0.11
>         Attachments: pig-2587_1.patch
> We'd like to be able to uniquely identify a re-executed script (possibly with different
> inputs/outputs) by creating a signature of the {{LogicalPlan}}. Here's the proposal:
> # Add a new method {{LogicalPlan.getSignature()}} that returns a hash of its {{LogicalPlanPrinter}}
> # In {{PigServer.execute()}} set the signature on the job conf after the LP is compiled,
> but before it's executed.
> (1) would allow an impl of {{PigProgressNotificationListener.setScriptPlan()}} to save
> the LP signature with the script metadata. Upon subsequent runs (2) would allow an impl of
> {{PigReducerEstimator}} (see PIG-2574) to retrieve the current LP signature and fetch the
> historical data for the script. It could then use the previous run data to better estimate
> the number of reducers.
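
The two steps of the proposal can be illustrated with a minimal sketch. It is not the attached patch: the class name, the `SIGNATURE_KEY` property name, and the use of SHA-256 are assumptions made for illustration, a plain `Map` stands in for the job conf, and a plain string stands in for the printed logical plan.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public class PlanSignatureSketch {
    // Hypothetical property name; the real key is not given in this message.
    static final String SIGNATURE_KEY = "example.logical.plan.signature";

    // Step 1: derive a signature by hashing the printed form of the plan.
    // SHA-256 is an assumption; the proposal only says "a hash".
    static String signatureOf(String printedPlan) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(printedPlan.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Step 2: record the signature in the job configuration (a Map stand-in
    // here) after the plan is compiled and before it is executed.
    static void storeSignature(Map<String, String> jobConf, String printedPlan) {
        jobConf.put(SIGNATURE_KEY, signatureOf(printedPlan));
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        // Placeholder text standing in for the printed plan.
        storeSignature(jobConf, "placeholder printed plan");
        System.out.println(SIGNATURE_KEY + "=" + jobConf.get(SIGNATURE_KEY));
    }
}
```

Identical printed text always yields the same signature, so a later run can look up historical data by this value. Any difference in the printed text yields a different signature.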
