hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-115) start script for pig
Date Thu, 28 Feb 2008 18:53:51 GMT

    [ https://issues.apache.org/jira/browse/PIG-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573399#action_12573399 ]

Alan Gates commented on PIG-115:
--------------------------------

I tried using the patch and ran into some issues with the new pig bash script.

First, if I ran it from any directory other than the one it is located in, the paths came out
wrong and contained duplicate values. For example, PIG_LOG_DIR became
/home/gates/src/pig/trunk/bin
/home/gates/src/pig/trunk/bin/../logs
(with a newline between the two values).
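
A directory repeated with a newline in between is the output you get when cd itself prints the
directory inside a command substitution, which it does when CDPATH is set and used. Whether that
is what happens in the patch is an assumption. The following is a minimal sketch of resolving the
script directory without that problem; script_dir and the PIG_HOME default are hypothetical and
not taken from the patch.

```shell
#!/usr/bin/env bash
# Sketch only: script_dir and the PIG_HOME default are illustrative assumptions,
# not code taken from the PIG-115 patch.

# Print the absolute directory that contains the file given as $1.
# Symlinks are not resolved.
script_dir() {
    # The empty CDPATH and the redirect both keep cd from echoing the target
    # directory, so pwd is the only thing written to stdout. The parentheses
    # run this in a subshell, leaving the caller's working directory alone.
    ( CDPATH= cd -- "$(dirname -- "$1")" >/dev/null && pwd )
}

bin_dir=$(script_dir "$0")
# Respect values the user already exported; otherwise derive them from bin/..
PIG_HOME=${PIG_HOME:-$(dirname -- "$bin_dir")}
PIG_LOG_DIR=${PIG_LOG_DIR:-$PIG_HOME/logs}
```

With this, PIG_LOG_DIR comes out as a single path no matter which directory the script is
started from.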

Second, when I ran it from the directory the script was in, it failed to start pig:

gates> JAVA_HOME=/usr/java/default ./pig
2008-02-28 10:49:09,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: localhost:9000
2008-02-28 10:49:09,468 [main] ERROR org.apache.pig.Main - org.apache.pig.backend.executionengine.ExecException: Failed to create DataStorage

A couple of general notes:

When I write a script like this I like to put in a switch like -secretDebugCmd (or something)
that prints out what the script would exec instead of actually execing it.  It makes
debugging easier, and it helps users see what the script is doing underneath.
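
A minimal sketch of such a switch, assuming the script funnels its final command through a
single function. DRY_RUN and launch are hypothetical names, and the java command line in the
trailing comment is only a guess at what the wrapper would build.

```shell
#!/usr/bin/env bash
# Sketch only: DRY_RUN and launch are hypothetical names. -secretDebugCmd is the
# placeholder switch name floated in the text.

DRY_RUN=false

# Either replace the shell with the given command, or just print it.
launch() {
    if [ "$DRY_RUN" = true ]; then
        # Arguments are joined with spaces and not re-quoted, so the output is
        # meant for reading rather than for pasting back into a shell.
        printf '%s\n' "$*"
    else
        exec "$@"
    fi
}

# Intended use near the end of the wrapper (left commented out in this sketch):
#   [ "$1" = "-secretDebugCmd" ] && { DRY_RUN=true; shift; }
#   launch "$JAVA_HOME/bin/java" -cp "$CLASSPATH" org.apache.pig.Main "$@"
```

Keeping the print and the exec in one function means the debug output cannot drift away from
what is actually run.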

Command line switches are nicer than environment variables.  It would be nicer to be able
to say pig --logdir=bla instead of PIG_LOG_DIR=bla pig.  I realize this makes the
script more complicated (it has to parse command line options, only pay attention to its own
and weed them out of the command line before passing the command on to java), and it is somewhat
dangerous in that as pig Main changes its command line options they could potentially conflict
with choices made in the script.  But I think many users prefer command line over environment
variables.
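
A minimal sketch of the weeding-out step, assuming bash arrays are acceptable in the start
script. parse_wrapper_opts and PIG_ARGS are hypothetical names; --logdir is the only option
handled here because it is the one used as the example.

```shell
#!/usr/bin/env bash
# Sketch only: parse_wrapper_opts and PIG_ARGS are hypothetical names; --logdir
# is the switch used as the example in the text.

PIG_ARGS=()

# Consume the options the wrapper owns and collect everything else, in order,
# in PIG_ARGS so it can be handed to java untouched.
parse_wrapper_opts() {
    PIG_ARGS=()
    local arg
    for arg in "$@"; do
        case $arg in
            --logdir=*) PIG_LOG_DIR=${arg#--logdir=} ;;
            *)          PIG_ARGS+=("$arg") ;;
        esac
    done
}
```

The wrapper would call parse_wrapper_opts "$@" and then pass "${PIG_ARGS[@]}" on to java.
Matching only the exact --name=value forms the script owns keeps the risk of colliding with
options of pig Main small, but it does not remove it.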

On pig.pl, you don't have to move that to bin, just blow it away.  Since it's Yahoo specific,
it doesn't need to exist in the apache distribution at all.

> start script for pig
> --------------------
>
>                 Key: PIG-115
>                 URL: https://issues.apache.org/jira/browse/PIG-115
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Stefan Groschupf
>         Attachments: PIG-115_v_1.patch, PIG-115_v_2.patch
>
>
> The current pig.pl is very y! specific, a generic start script is required that works for all users.
> Goal of this issue is to collect a list of requirements a new script has to fulfill.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

