pig-dev mailing list archives

From "Michael Bush (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PIG-5110) Removing schema alias and :: coming from parent relation
Date Fri, 22 Dec 2017 14:30:00 GMT

    [ https://issues.apache.org/jira/browse/PIG-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16301505#comment-16301505 ]

Michael Bush edited comment on PIG-5110 at 12/22/17 2:29 PM:
-------------------------------------------------------------

Great patch!

I've found a scenario where it doesn't remove the prefix.  See the following:

{code}
set pig.store.schema.disambiguate false

A = load 'data1' as (a, b);
B = load 'data1' as (c, d);
C = join A by a, B by c;

describe C;
{code}

Joins work fine and the prefix is removed.
{noformat}
C: {a: bytearray,b: bytearray,c: bytearray,d: bytearray}
{noformat}

However, when there is a group, the prefix is not removed. Continuing from above:

{code}
D = group C by a;
E = load 'data1' as (e, f);
F = join E by e, D by group;

describe F;
{code}
{noformat}
F: {e: bytearray,f: bytearray,group: bytearray,C: {(A::a: bytearray,A::b: bytearray,B::c: bytearray,B::d: bytearray)}}
{noformat}

Notice that the prefix was not removed inside the grouped bag C.

So when I use JsonStorage, it leaves the prefixes in place, which is undesirable.
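
To make the expectation concrete, here is a minimal Python sketch of the schema one would presumably want `describe F` to print, obtained by stripping every `alias::` prefix at any nesting depth. It only post-processes the describe string shown above; `strip_alias_prefixes` is a hypothetical helper for illustration, not part of Pig or of the attached patches.

```python
import re

# One or more "alias::" prefixes in front of a field name (e.g. "A::" or "A::B::").
_PREFIX = re.compile(r"(?:[A-Za-z_]\w*::)+")


def strip_alias_prefixes(schema: str) -> str:
    """Remove parent-relation prefixes from a describe-style schema string.

    Hypothetical helper: it works on the text only and does not check for
    duplicate field names, which stays the user's responsibility.
    """
    return _PREFIX.sub("", schema)


# The schema reported for F in the comment above.
actual = (
    "F: {e: bytearray,f: bytearray,group: bytearray,"
    "C: {(A::a: bytearray,A::b: bytearray,B::c: bytearray,B::d: bytearray)}}"
)

expected = strip_alias_prefixes(actual)
print(expected)
```

The single colons that separate a field name from its type are untouched, because the pattern only matches a name followed by a double colon.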


> Removing schema alias and :: coming from parent relation
> --------------------------------------------------------
>
>                 Key: PIG-5110
>                 URL: https://issues.apache.org/jira/browse/PIG-5110
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>             Fix For: 0.17.0
>
>         Attachments: PIG-5110.0.patch, PIG-5110.1.patch, PIG-5110.2.patch
>
>
> Customers have asked for a feature to get rid of the schema alias prefixes. CROSS, JOIN, FLATTEN, etc. prepend the field name with the parent field alias and ::
> I would like to find a way to disable this feature. (The burden of making sure not to have duplicate aliases - and hence the appropriate FrontendException getting thrown - is on the user)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
