pig-dev mailing list archives

From "Mridul Muralidharan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-1693) support project-range expression. (was: There needs to be a way in foreach to indicate "and all the rest of the fields" )
Date Tue, 05 Apr 2011 22:01:06 GMT

    [ https://issues.apache.org/jira/browse/PIG-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016145#comment-13016145 ]

Mridul Muralidharan commented on PIG-1693:
------------------------------------------

I am not sure what the comment means. In the example above, do you mean that:
a) $3.. works for an unspecified number of columns when there is no load schema, or
b) $3..$MAX is required (so we should be schema aware)?

Or do you simply mean that '..' works when there is no loader schema (which I assumed it would
anyway), without commenting on the actual use case I refer to above?

Thanks,
Mridul

> support project-range expression. (was: There needs to be a way in foreach to indicate "and all the rest of the fields" )
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1693
>                 URL: https://issues.apache.org/jira/browse/PIG-1693
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Thejas M Nair
>             Fix For: 0.9.0
>
>         Attachments: PIG-1693.1.patch, PIG-1693.2.patch
>
>
> A common use case we see in Pig is that people have many columns in their data but only want to operate on a few of them.  Consider, for example, a user who wants to cast one column before storing data with ten columns:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, secondcol, thirdcol, fourthcol, fifthcol, sixthcol, seventhcol, eighthcol, ninthcol, tenthcol;
> store Z into 'output';
> {code}
> Obviously this only gets worse as the number of columns grows.  Ideally the above could be transformed to something like:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, "and all the rest";
> store Z into 'output';
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
