hadoop-pig-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-185) Using cached data does not give me the expected result
Date Fri, 04 Apr 2008 08:51:24 GMT

     [ https://issues.apache.org/jira/browse/PIG-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy resolved PIG-185.
-------------------------------

    Resolution: Invalid

Xu, cache() doesn't work like ship().

You need to pass cache(<filename>#<linkname>), and then you can refer to the file as
./<linkname> in your streaming command.
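
For illustration, the script from the report might be adjusted along these lines. This is only a sketch and has not been run: the link name studenttab10k after the '#' is picked here for the example, and everything else is taken from the report as-is.

{code}
-- The part after '#' is the symlink name created in the task's working directory.
-- The link name 'studenttab10k' is an assumption made for this example.
define X `perl -ne 'chomp $_; print "$_\n"' - ./studenttab10k` cache('/user/pig/tests/data/singlefile/studenttab10k#studenttab10k');
A = load '/user/pig/tests/data/singlefile/studenttab10k';
B = stream A through X as (name, age, gpa);
C = group B by name;
D = foreach C generate COUNT(B.$0);
store D into 'results_22';
{code}

The streaming command now opens ./studenttab10k instead of a path that does not exist in the task's working directory. Note that each streaming task appends the whole cached file to its own share of the input, so whether every count comes out even likely also depends on how many tasks run.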

> Using cached data does not give me the expected result
> ------------------------------------------------------
>
>                 Key: PIG-185
>                 URL: https://issues.apache.org/jira/browse/PIG-185
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Xu Zhang
>            Assignee: Arun C Murthy
>
> I was trying to run the following Pig script with the latest Pig stuff. Since essentially
> I was streaming 2 identical sets of data, I was expecting the final result, which is the count
> of the name field, to contain only even numbers. However, lots of odd numbers showed up in the
> actual result.
> {code}
> define X `perl -ne 'chomp $_; print "$_\n"' - ./user/pig/tests/data/singlefile/studenttab10k` cache('/user/pig/tests/data/singlefile/studenttab10k');
> A = load '/user/pig/tests/data/singlefile/studenttab10k';
> B = stream A through X as (name, age, gpa);
> C = group B by name;
> D = foreach C generate COUNT(B.$0);
> store D into 'results_22';
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

