hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-812) COUNT(*) does not work
Date Thu, 21 May 2009 01:31:45 GMT

     [ https://issues.apache.org/jira/browse/PIG-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-812.
--------------------------------

    Resolution: Won't Fix

The fact that this worked in earlier code was a bug. Pig now has a consistent implementation
of *.

From the beginning, Pig chose different semantics for * than SQL. (It is unfortunate
that we chose to use "*" for this, but it is something we need to leave in for consistency and
backward compatibility.)

"*" in SQL means a relation, while in Pig it means a tuple passed to an operator. So in Pig
you can order on the entire row by saying:

B = order A by *;

You can also pass the entire row to a UDF by doing myUDF(*).

In this context COUNT(*) makes no sense.
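
For reference, a minimal sketch of how the reporter's script is usually written instead,
assuming the relation and file names from the report quoted below. COUNT is given the bag
produced by GROUP ... ALL (which is named after the grouped relation, studenttab) rather than *:

{code}
-- Same load as in the report; the schema is taken from the reporter's script.
studenttab = LOAD 'studenttab10k' AS (name:chararray, age:int, gpa:float);
-- GROUP ... ALL yields a single tuple: (group, studenttab), where studenttab is a bag.
X2 = GROUP studenttab ALL;
-- Pass the bag, not the whole tuple, to COUNT.
Y2 = FOREACH X2 GENERATE COUNT(studenttab);
DUMP Y2;
{code}

Here COUNT receives a bag of tuples, which is the input it expects. With *, it would instead
receive the whole X2 tuple, which appears consistent with the ClassCastException in the trace below.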

> COUNT(*) does not work 
> -----------------------
>
>                 Key: PIG-812
>                 URL: https://issues.apache.org/jira/browse/PIG-812
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Viraj Bhat
>            Priority: Critical
>             Fix For: 0.2.0
>
>         Attachments: studenttab10k
>
>
> Pig script to count the number of rows in a studenttab10k file which contains 10k records.
> {code}
> studenttab = LOAD 'studenttab10k' AS (name:chararray, age:int,gpa:float);
> X2 = GROUP studenttab ALL;
> describe X2;
> Y2 = FOREACH X2 GENERATE COUNT(*);
> explain Y2;
> DUMP Y2;
> {code}
> returns the following error
> ================================================================
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias Y2
> Details at logfile: /homes/viraj/pig-svn/trunk/pig_1242783700970.log
> ================================================================
> If you look at the log file:
> ================================================================
> Caused by: java.lang.ClassCastException
>         at org.apache.pig.builtin.COUNT$Initial.exec(COUNT.java:76)
>         at org.apache.pig.builtin.COUNT$Initial.exec(COUNT.java:68)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:223)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:245)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:236)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:88)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> ================================================================

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

