hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Santhosh Srinivasan (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-664) Semantics of * is not consistent
Date Tue, 10 Feb 2009 19:17:59 GMT
Semantics of * is not consistent
--------------------------------

                 Key: PIG-664
                 URL: https://issues.apache.org/jira/browse/PIG-664
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
            Reporter: Santhosh Srinivasan
            Assignee: Santhosh Srinivasan
             Fix For: types_branch


The semantics of * is not consistent in PIG. The use of * with generate results in the all
the columns of the record being flattened. However, the use of * as an input to a UDF results
in a tuple (wrapped in another tuple). For consistency, * should always result in all the
columns of the record (i.e., flattened). The use of * occurs in:

1. Foreach generate: E.g.: foreach input generate *;
2. Input to UDFs: E.g. foreach input generate myUDF(*);
3. Order by: E.g.: order input by *;
4. (Co)Group: E.g.: group a by *; cogroup a by *, b by *;

In terms of implementation, this involves rolling back the fix introduced in PIG-597 and fixing
the following builtin UDFs:

1. ARITY - Should return the size of the input tuple instead of extracting the first column
of the input tuple
2. SIZE - Should return the size of the input tuple instead of extracting the first column
of the input tuple

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message