hadoop-pig-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-272) Failure running complex script with streaming
Date Mon, 23 Jun 2008 06:30:45 GMT

     [ https://issues.apache.org/jira/browse/PIG-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated PIG-272:

    Attachment: split.pl

Attached fix. The patch ensures we deep-copy the StreamingCommand before optimizing it and
reverts the optimization piecemeal (i.e., for input and output separately).
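
For readers without the patch at hand, here is a rough Java sketch of the general idea only. It is not the code in the attached patch: CommandSpec, StreamingCopySketch, the "file:" spec strings and the paths are hypothetical stand-ins, and Pig's real StreamingCommand class is not used. The sketch assumes that an optimization rewrites a command's input/output specs in place, so a command shared by several operators has to be copied before it is touched, and each side has to be revertible on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a streaming command definition (NOT Pig's StreamingCommand).
final class CommandSpec {
    final String executable;
    final List<String> inputSpecs;
    final List<String> outputSpecs;

    CommandSpec(String executable, List<String> inputSpecs, List<String> outputSpecs) {
        this.executable = executable;
        // Copy the lists so that no two CommandSpec instances share mutable state.
        this.inputSpecs = new ArrayList<>(inputSpecs);
        this.outputSpecs = new ArrayList<>(outputSpecs);
    }

    // Deep copy: strings are immutable, so fresh lists are sufficient here.
    CommandSpec deepCopy() {
        return new CommandSpec(executable, inputSpecs, outputSpecs);
    }
}

final class StreamingCopySketch {
    // Assumed optimization: point the command's input and output directly at files.
    // It works on a copy, so the shared definition is never mutated.
    static CommandSpec optimize(CommandSpec shared, String inputPath, String outputPath) {
        CommandSpec copy = shared.deepCopy();
        copy.inputSpecs.set(0, "file:" + inputPath);
        copy.outputSpecs.set(0, "file:" + outputPath);
        return copy;
    }

    // Revert only the input side, leaving any output optimization in place.
    static void revertInput(CommandSpec optimized, CommandSpec shared) {
        optimized.inputSpecs.clear();
        optimized.inputSpecs.addAll(shared.inputSpecs);
    }

    // Revert only the output side, leaving any input optimization in place.
    static void revertOutput(CommandSpec optimized, CommandSpec shared) {
        optimized.outputSpecs.clear();
        optimized.outputSpecs.addAll(shared.outputSpecs);
    }

    public static void main(String[] args) {
        CommandSpec shared = new CommandSpec(
                "perl identity.pl", Arrays.asList("stdin"), Arrays.asList("stdout"));
        CommandSpec optimized = optimize(shared, "in/part-0", "out/part-0"); // hypothetical paths
        revertInput(optimized, shared); // undo the input half only
        System.out.println("shared:    " + shared.inputSpecs + " " + shared.outputSpecs);
        System.out.println("optimized: " + optimized.inputSpecs + " " + optimized.outputSpecs);
    }
}
```

In this sketch the shared definition keeps its original specs no matter what happens to the optimized copy, and undoing the input side does not disturb the output side.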

The test cases are quite complex/convoluted and are pretty hard to convert to unit-tests,
which is why I've attached them here and propose we integrate them into our end-to-end tests...

> Failure running complex script with streaming
> ---------------------------------------------
>                 Key: PIG-272
>                 URL: https://issues.apache.org/jira/browse/PIG-272
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Olga Natkovich
>            Assignee: Arun C Murthy
>         Attachments: PIG-272_0_20080621.patch, PIG-272_test.pig, split.pl
> The following script fails (stack is further down):
> define CMD `perl identity.pl`;
> define CMD1 `perl identity.pl`;
> A = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
> B = stream A through CMD;
> store B into 'B1';
> C = stream B through CMD1;
> D = JOIN B by name, C by name;
> store D into 'D1';
> If I remove the intermediate store, the script works fine. Also if I replace streaming
> commands with other operators such as filter and foreach, it works even with the intermediate

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
