pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4950) Fix minor issues with running scripts in non-local FileSystems
Date Sun, 17 Jul 2016 08:27:20 GMT

     [ https://issues.apache.org/jira/browse/PIG-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Daniel Dai updated PIG-4950:
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 0.16.1
           Status: Resolved  (was: Patch Available)


Patch committed to both 0.16 branch and trunk. Thanks Peter!

> Fix minor issues with running scripts in non-local FileSystems
> --------------------------------------------------------------
>                 Key: PIG-4950
>                 URL: https://issues.apache.org/jira/browse/PIG-4950
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.15.0, 0.16.0
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>            Priority: Minor
>             Fix For: 0.17.0, 0.16.1
>         Attachments: PIG-4950.1.patch
> There are two similar minor issues regarding running Pig scripts located in non-local
FileSystems such as hdfs and s3.
> # The first occurs when the script path is passed using the ‘-f’ option. In this
case, the script contents are not set in ScriptState. Instead a WARN message is logged due
to an IOException being thrown. This is because the ‘remote’ path is treated as a local
one. Instead, the path of the downloaded script should be passed over to ScriptState#setScript.
As a result of this bug, an empty string is set for the “pig.script” property when the
Pig job runs on a Hadoop cluster. Also, if Tez is being used, then the Dag info does not include
the script contents as it normally does when a local script is passed.
> # The second issue is more minor, but #validateLogFile in the Main class is set to use
the path given by the user rather than using the downloaded local file path. Again, #validateLogFile
method treats the given path as a local one, but this would not be the case if the user specifies
a remote path. i.e. one with a scheme such as hdfs or s3. This occurs in both cases: when
the script is specified using the ‘-f’ option or when the script is passed as the last/remaining
> Both fixes to these issues are to just pass in the local downloaded path instead. If
the script path specified is a local one, then the local downloaded path would just be that
path specified.

This message was sent by Atlassian JIRA

View raw message