pig-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-1926) Sample/Limit should take scalar
Date Sat, 04 Jun 2011 01:46:47 GMT

    [ https://issues.apache.org/jira/browse/PIG-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044164#comment-13044164 ]

Thejas M Nair commented on PIG-1926:
------------------------------------

Comments on the most recent patch:
- The expression in limit should be allowed to refer to columns only in scalar context, i.e.,
Pig should give a proper error message when a statement that violates this is created. Right now
it allows the statement "lim = limit l $0;", and that results in an NPE. You can create a
validation visitor to check this, called from PigServer.Graph.compile(lp).
- If the expression in limit evaluates to bytearray, it can be implicitly cast to a long.
This can be done in the type checker.
- The test files are missing. Maybe you forgot to include the test dir in the diff?
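
To make the first two points concrete, here is a rough, self-contained sketch of the kind of check
and cast I have in mind. It is only an illustration: the classes below (Expr, Column, Scalar,
Divide, Cast, LimitExprCheck) are hypothetical stand-ins, not Pig's actual logical-plan or visitor
API, and the typing of Divide is deliberately simplified.

```java
public class LimitExprCheck {
    // Hypothetical, simplified type system: only the two types relevant here.
    enum Type { LONG, BYTEARRAY }

    // Hypothetical expression node; NOT Pig's LogicalExpression.
    static abstract class Expr {
        final Type type;
        Expr(Type type) { this.type = type; }
    }

    static final class Const extends Expr {
        final long value;
        Const(long value) { super(Type.LONG); this.value = value; }
    }

    // Bare column reference such as $0 (not allowed in a limit expression).
    static final class Column extends Expr {
        final int index;
        Column(int index) { super(Type.BYTEARRAY); this.index = index; }
    }

    // Scalar reference such as c.sum (allowed).
    static final class Scalar extends Expr {
        final String relation;
        final String field;
        Scalar(String relation, String field, Type type) {
            super(type);
            this.relation = relation;
            this.field = field;
        }
    }

    // Division; typed as LONG here purely to keep the sketch short.
    static final class Divide extends Expr {
        final Expr left;
        final Expr right;
        Divide(Expr left, Expr right) { super(Type.LONG); this.left = left; this.right = right; }
    }

    static final class Cast extends Expr {
        final Expr inner;
        Cast(Expr inner, Type to) { super(to); this.inner = inner; }
    }

    // Walks the expression and fails with a readable message instead of a later NPE.
    static void validate(Expr e) {
        if (e instanceof Column) {
            throw new IllegalArgumentException(
                "Column $" + ((Column) e).index
                + " in limit expression must be used in scalar context, e.g. rel.field");
        }
        if (e instanceof Divide) {
            validate(((Divide) e).left);
            validate(((Divide) e).right);
        } else if (e instanceof Cast) {
            validate(((Cast) e).inner);
        }
    }

    // Wraps a bytearray-typed expression in an implicit cast to long.
    static Expr castToLongIfNeeded(Expr e) {
        return e.type == Type.BYTEARRAY ? new Cast(e, Type.LONG) : e;
    }

    public static void main(String[] args) {
        // Models "limit d c.sum/100": accepted.
        validate(new Divide(new Scalar("c", "sum", Type.LONG), new Const(100)));
        // Models "limit l $0": rejected with a proper message.
        try {
            validate(new Column(0));
        } catch (IllegalArgumentException ex) {
            System.out.println("Rejected: " + ex.getMessage());
        }
        // Scalar with unknown (bytearray) type gets an implicit cast to long.
        Expr casted = castToLongIfNeeded(new Scalar("c", "n", Type.BYTEARRAY));
        System.out.println("Type after cast: " + casted.type);
    }
}
```

In the real patch the check would be a visitor over the logical plan and the cast would be
inserted by the type checker; the sketch only shows the intended behavior.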

I still need to review the MRCompiler changes. 

Can you use a different file name for each patch (e.g., PIG-1926.5.patch), and also refer to
that file name in your comments? That way it is easier to identify the downloaded patch files
and associate them with the comments.


> Sample/Limit should take scalar
> -------------------------------
>
>                 Key: PIG-1926
>                 URL: https://issues.apache.org/jira/browse/PIG-1926
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Daniel Dai
>            Assignee: Gianmarco De Francisci Morales
>              Labels: gsoc2011
>         Attachments: PIG-1926.patch, PIG-1926.patch, PIG-1926.patch, PIG-1926.patch,
>                      PIG-1926.patch, PIG-1926.patch
>
>
> Currently, Limit and Sample only take a constant. It would be better if we could use a scalar
> in place of the constant. E.g.:
> {code}
> a = load 'a.txt';
> b = group a all;
> c = foreach b generate COUNT(a) as sum;
> d = order a by $0;
> e = limit d c.sum/100;
> {code}
> This is a candidate project for Google Summer of Code 2011. More information about the
> program can be found at http://wiki.apache.org/pig/GSoc2011

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
