pig-dev mailing list archives

From "Vivek Padmanabhan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2363) Provide configuration to disable persistence of _logs for streaming commands
Date Tue, 15 Nov 2011 12:04:53 GMT

    [ https://issues.apache.org/jira/browse/PIG-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150416#comment-13150416 ]

Vivek Padmanabhan commented on PIG-2363:

As I understand it, in Pig 0.8 stderr logging happens only when the user explicitly specifies
the logging option in the DEFINE clause, for example:
DEFINE mycmd `t.pl` stderr('mylogs' limit 100);
This is defined in the DefineClause production of QueryParser.jjt:
<INPUT> "(" InputOutputSpec(command, StreamingCommand.Handle.INPUT) ")"
<OUTPUT> "(" InputOutputSpec(command, StreamingCommand.Handle.OUTPUT) ")"
<ERROR> "(" ErrorSpec(command, t.image) ")"

But in Pig 0.9, with the parser changes, LogicalPlanBuilder.buildCommand always sets
logging to true.
This is because LogicalPlanGenerator.g sets the log directory to the alias name when no
directory is specified:
($error_clause.dir == null? $alias : $error_clause.dir)
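
To make the difference concrete, here is a minimal Java sketch of the two behaviours as I understand them. The class and method names are hypothetical and are not Pig's actual code; it also assumes that stderr persistence is switched on whenever a log directory is set.

```java
// Hypothetical sketch: these names are illustrative, not Pig's real classes or methods.
final class StderrLogDirSketch {

    // Pig 0.9 behaviour as described above: fall back to the alias when the
    // DEFINE clause gives no stderr directory, so the result is never null.
    static String pig09LogDir(String alias, String errorClauseDir) {
        return errorClauseDir == null ? alias : errorClauseDir;
    }

    // Pig 0.8-style behaviour as described above: the directory stays null
    // unless the user wrote an explicit stderr(...) clause.
    static String pig08LogDir(String errorClauseDir) {
        return errorClauseDir;
    }

    // Assumption: stderr is persisted whenever a log directory is set.
    static boolean persistsStderr(String logDir) {
        return logDir != null;
    }
}
```

With no stderr clause, pig09LogDir("mycmd", null) yields "mycmd", so persistsStderr is true, while pig08LogDir(null) stays null and nothing is persisted. Dropping the alias fallback would restore the 0.8 default without a new configuration option.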

So this change looks intentional. If it is not, then, as Olga mentioned, we should probably
change this behavior rather than add a config option.
> Provide configuration to disable persistence of _logs for streaming commands
> ----------------------------------------------------------------------------
>                 Key: PIG-2363
>                 URL: https://issues.apache.org/jira/browse/PIG-2363
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Vivek Padmanabhan
>         Attachments: PIG-2363_1.patch
> For Pig scripts that have streaming commands, the stderr is saved into HDFS under the _logs
> folder in the output directory.
> This behavior was not seen with Pig 0.8 by default, but from 0.9 onwards we are seeing the
> _logs folder.
> Hence it would be nice to have a configuration to disable this feature.
> Sample script
> {code}
> DEFINE mycmd `t.pl` ship ('t.pl');
> a = load 'i1' as (f1:chararray,f2:chararray);
> b = stream a through mycmd;
> store b into 'output';
> {code}
