hadoop-pig-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1461) support union operation that merges based on column names
Date Wed, 28 Jul 2010 18:58:22 GMT

    [ https://issues.apache.org/jira/browse/PIG-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893307#action_12893307 ]

Thejas M Nair commented on PIG-1461:
------------------------------------

The pseudo column containing the source relation, proposed in the first comment, seems unnecessary.
If a user needs the source information, they can project it in an additional
foreach before the union.
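
To illustrate the point, here is a minimal Python sketch of the semantics, not Pig code and not the implementation. Relations are modelled as lists of dicts, and the helper names with_source and union_by_name, as well as the column name src, are hypothetical. Filling absent columns with None (null) is an assumption of this sketch.

```python
def with_source(rows, source):
    """Mimic a foreach that projects every column plus a constant source tag."""
    return [{**row, "src": source} for row in rows]


def union_by_name(*relations):
    """Union rows by column name; columns a row lacks become None (assumed)."""
    columns = []
    for relation in relations:
        for row in relation:
            for name in row:
                if name not in columns:
                    columns.append(name)
    return [
        {name: row.get(name) for name in columns}
        for relation in relations
        for row in relation
    ]


# Toy data standing in for the relations loaded from 'x' and 'y'.
L1 = [{"a": 1, "b": 2}]
L2 = [{"b": 3, "c": 4}]

# The source tag is added by the user before the union, so the union
# itself needs no built-in pseudo column.
U = union_by_name(with_source(L1, "x"), with_source(L2, "y"))
for row in U:
    print(row)
```

Because the tag is an ordinary column, it is merged by name like any other, and users who do not need it pay nothing for it.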



> support union operation that merges based on column names
> ---------------------------------------------------------
>
>                 Key: PIG-1461
>                 URL: https://issues.apache.org/jira/browse/PIG-1461
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> When the data has a schema, it often makes sense to union on the column names in the schema
> rather than on the position of the columns.
> The behavior of the existing union operator should remain backward compatible.
> This feature can be supported using either a new operator or by extending union to support
> a 'using' clause. I am thinking of having a new operator called either unionschema or merge.
> Does anybody have any other suggestions for the syntax?
> example -
> L1 = load 'x' as (a,b);
> L2 = load 'y' as (b,c);
> U = unionschema L1, L2;
> describe U;
> U: {a:bytearray, b:bytearray, c:bytearray}
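
The schema shown in the example can be reproduced with a small Python sketch of a name-based schema merge. This is only an illustration under stated assumptions: merge_schemas is a hypothetical helper, columns keep the order in which they are first seen, and raising an error on conflicting types is an assumption, since the issue does not say how type conflicts should be handled.

```python
def merge_schemas(*schemas):
    """Merge (name, type) schemas by column name, keeping first-seen order."""
    merged = {}
    for schema in schemas:
        for name, col_type in schema:
            # Assumption: the same column name with different types is an error.
            if name in merged and merged[name] != col_type:
                raise ValueError(f"conflicting types for column {name!r}")
            merged.setdefault(name, col_type)
    return list(merged.items())


# Schemas of L1 and L2 from the example; untyped columns are bytearray.
l1_schema = [("a", "bytearray"), ("b", "bytearray")]
l2_schema = [("b", "bytearray"), ("c", "bytearray")]

u_schema = merge_schemas(l1_schema, l2_schema)
described = "U: {" + ", ".join(f"{n}:{t}" for n, t in u_schema) + "}"
print(described)
```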

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.