hadoop-pig-dev mailing list archives

From "Richard Ding (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1333) API interface to Pig
Date Thu, 20 May 2010 23:34:21 GMT

    [ https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12869809#action_12869809 ]

Richard Ding commented on PIG-1333:

I propose Pig add a new class PigRunner that has a run method that returns a PigStats object:

package org.apache.pig;

public abstract class PigRunner {
    public static PigStats run(String[] args) {...}
}

The PigStats class will include the following methods:

boolean isSuccessful() 

int getReturnCode() // a list of return codes will be defined in PigRunner

String getErrorMessage()

int getErrorCode() // PigException's error code

int getNumberJobs() // number of MR jobs for this invocation

JobPlan getJobPlan() // DAG of MR jobs (a.k.a. an OperatorPlan)

List<String> getOutputLocations() 

long getNumberRecords(String location) // number of records in the given output location

long getNumberBytes(String location)  // number of bytes in the given location

... ... // a few more
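
To make the intended use concrete, here is a rough sketch of how a calling application (a workflow engine, say) might consume such a status object. It does not depend on Pig: the Stats interface is a hypothetical stand-in that copies a subset of the method names listed above, canned() is a made-up helper that replaces the call to PigRunner.run(args), and the locations, counts, and codes are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PigStatsSketch {

    /** Hypothetical stand-in for the proposed PigStats (subset of the methods listed above). */
    interface Stats {
        boolean isSuccessful();
        int getReturnCode();
        String getErrorMessage();
        int getErrorCode();
        List<String> getOutputLocations();
        long getNumberRecords(String location);
    }

    /** Builds the one-line summary a workflow engine might log for a run. */
    static String report(Stats stats) {
        if (!stats.isSuccessful()) {
            return "FAILED rc=" + stats.getReturnCode()
                    + " errorCode=" + stats.getErrorCode()
                    + " message=" + stats.getErrorMessage();
        }
        StringBuilder sb = new StringBuilder("OK rc=" + stats.getReturnCode());
        for (String location : stats.getOutputLocations()) {
            sb.append(' ').append(location)
              .append('=').append(stats.getNumberRecords(location));
        }
        return sb.toString();
    }

    /** Canned Stats for the demo; in the proposal it would come from PigRunner.run(args). */
    static Stats canned(final boolean ok, final int rc, final int errorCode,
                        final String message, final Map<String, Long> records) {
        return new Stats() {
            @Override public boolean isSuccessful() { return ok; }
            @Override public int getReturnCode() { return rc; }
            @Override public String getErrorMessage() { return message; }
            @Override public int getErrorCode() { return errorCode; }
            @Override public List<String> getOutputLocations() {
                return new ArrayList<String>(records.keySet());
            }
            @Override public long getNumberRecords(String location) {
                Long n = records.get(location);
                return n == null ? 0L : n;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Long> records = new LinkedHashMap<String, Long>();
        records.put("/out/a", 10L);  // made-up output location and record count
        System.out.println(report(canned(true, 0, 0, null, records)));
        // The return code and error code below are made up for illustration.
        System.out.println(report(canned(false, 2, 1000, "script failed", records)));
    }
}
```

The point of the sketch is that the caller branches on isSuccessful() and reads structured fields, rather than parsing command-line output.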

A job in the JobPlan will include these methods:


String getAlias() // the alias associated with this job

String getFeature()  // the Pig feature associated with this job

int getNumberMaps() 

int getNumberReduces() 

... ... // a few more methods on job statistics retrieved from Hadoop
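
As a rough illustration of how per-job statistics could be consumed, the sketch below summarizes a set of jobs. It is only a sketch under assumptions: Job is a hypothetical interface that copies the four method names above, SimpleJob is a made-up value class, the DAG is flattened to a plain List for brevity, and the feature strings are invented placeholders rather than values defined by Pig.

```java
import java.util.Arrays;
import java.util.List;

public class JobPlanSketch {

    /** Hypothetical stand-in for a job in the proposed JobPlan. */
    interface Job {
        String getAlias();
        String getFeature();
        int getNumberMaps();
        int getNumberReduces();
    }

    /** Made-up immutable implementation used only to feed the demo. */
    static final class SimpleJob implements Job {
        private final String alias;
        private final String feature;
        private final int maps;
        private final int reduces;

        SimpleJob(String alias, String feature, int maps, int reduces) {
            this.alias = alias;
            this.feature = feature;
            this.maps = maps;
            this.reduces = reduces;
        }

        @Override public String getAlias() { return alias; }
        @Override public String getFeature() { return feature; }
        @Override public int getNumberMaps() { return maps; }
        @Override public int getNumberReduces() { return reduces; }
    }

    /** Formats one job as "alias [feature] maps=N reduces=M". */
    static String describe(Job job) {
        return job.getAlias() + " [" + job.getFeature() + "] maps="
                + job.getNumberMaps() + " reduces=" + job.getNumberReduces();
    }

    /** Sums map and reduce tasks over all jobs in the (flattened) plan. */
    static int totalTasks(List<? extends Job> jobs) {
        int total = 0;
        for (Job job : jobs) {
            total += job.getNumberMaps() + job.getNumberReduces();
        }
        return total;
    }

    public static void main(String[] args) {
        // Aliases, feature names, and task counts are invented for illustration.
        List<Job> jobs = Arrays.<Job>asList(
                new SimpleJob("A", "GROUP_BY", 4, 2),
                new SimpleJob("B", "ORDER_BY", 3, 1));
        for (Job job : jobs) {
            System.out.println(describe(job));
        }
        System.out.println("total tasks: " + totalTasks(jobs));
    }
}
```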


> API interface to Pig
> --------------------
>                 Key: PIG-1333
>                 URL: https://issues.apache.org/jira/browse/PIG-1333
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>            Assignee: Richard Ding
>             Fix For: 0.8.0
> It would be nice to make Pig friendlier to applications, such as workflow
> engines, that execute Pig scripts on a user's behalf.
> Currently, they have to use the pig command line to execute the code; however,
> this limits the kind of output that can be delivered. For instance, it is hard
> to produce error information that is easy to use programmatically, or to
> collect statistics.
> The proposal is to create a class that mimics the behavior of Main but gives
> users a status object back. The main code of Pig would look something like:
> public static void main(String[] args)
> {
>     PigStatus ps = PigMain.exec(args);
>     System.exit(ps.rc);  // exit with the return code carried by the status object
> }
> We need to define the following:
> - Content of PigStatus. It should at least include
>    * return code
>    * error string
>    * exception 
>    * statistics
> - A way to propagate the status class through pig code

