pig-dev mailing list archives

From "Cheolsoo Park (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-3269) In operator support
Date Mon, 08 Apr 2013 11:35:16 GMT

     [ https://issues.apache.org/jira/browse/PIG-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3269:
-------------------------------

    Attachment: PIG-3269-2.patch

I realized that I don't need a limit on the number of operands. Since the return type of my UDF
is always Boolean, I don't have to implement {{getArgToFuncMapping}}, unlike for the CASE statement.

Updating the patch.
                
> In operator support
> -------------------
>
>                 Key: PIG-3269
>                 URL: https://issues.apache.org/jira/browse/PIG-3269
>             Project: Pig
>          Issue Type: New Feature
>          Components: internal-udfs, parser
>    Affects Versions: 0.11
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12
>
>         Attachments: PIG-3269-2.patch, PIG-3269.patch
>
>
> This is another language improvement using the same approach as in PIG-3268.
> Currently, Pig has no support for the IN operator. To mimic it, users often have to chain several OR conditions.
> For example,
> {code}
> a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
> b = FILTER a BY 
>    (i == 1) OR
>    (i == 22) OR
>    (i == 333) OR
>    (i == 4444) OR
>    (i == 55555);
> {code}
> But this can be rewritten more compactly using the IN operator as follows:
> {code}
> a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
> b = FILTER a BY i IN (1,22,333,4444,55555);
> {code}
> I propose that we implement the IN operator in the following manner:
> * Add built-in UDFs that take expressions as args. For the IN operator above, for example, we can define a UDF such as {{builtInUdf(i, 1, 22, 333, 4444, 55555)}}.
> * Add syntactic sugar for these built-in UDFs.
> Similarly to PIG-3268, this approach requires a limit on the number of values. This is again because we need to populate the full list of possible arg schemas in {{EvalFunc.getArgToFuncMapping}}. For now, I arbitrarily chose 50, but it can easily be changed.
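
For illustration only, the matching semantics of a call such as {{builtInUdf(i, 1, 22, 333, 4444, 55555)}} can be sketched in plain Java. This is not the code in the attached patch and it does not use Pig's EvalFunc API. The class name {{InSketch}}, the method name {{in}}, and the handling of a null operand are all assumptions made for the example.

```java
// Hypothetical stand-alone sketch; it does not depend on any Pig classes.
class InSketch {

    // Returns true when `value` equals any of `candidates`,
    // mirroring the proposed `value IN (c1, c2, ...)` expression.
    // The result type is always Boolean, whatever the operand types are.
    static Boolean in(Object value, Object... candidates) {
        if (value == null) {
            // Assumption for this sketch: a null operand never matches.
            return false;
        }
        for (Object candidate : candidates) {
            if (value.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same value list as the FILTER example above.
        System.out.println(in(22, 1, 22, 333, 4444, 55555));
        System.out.println(in(7, 1, 22, 333, 4444, 55555));
    }
}
```

The variadic parameter accepts any number of candidate values, so this sketch itself needs no fixed upper bound on the list length.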

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
