pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-5272) BagToTuple output schema is incorrect
Date Tue, 26 Sep 2017 19:27:02 GMT

     [ https://issues.apache.org/jira/browse/PIG-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5272:
------------------------------------
    Summary: BagToTuple output schema is incorrect  (was: BagToTuple Output Schema)

> BagToTuple output schema is incorrect
> -------------------------------------
>
>                 Key: PIG-5272
>                 URL: https://issues.apache.org/jira/browse/PIG-5272
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.17.0
>            Reporter: Joshua Juen
>            Priority: Minor
>              Labels: patch
>             Fix For: 0.18.0
>
>         Attachments: BagToTupleSchema.patch
>
>
> The output schema declared by BagToTuple is incorrect, which causes problems when the tuple is used later in the same script.
> For example, given a bag with schema { data:chararray }, calling BagToTuple yields the schema ( data:chararray ).
> This cannot be right: if the bag contains the entries {data1, data2, data3}, the output tuple from BagToTuple will be
> (data1:chararray, data2:chararray, data3:chararray), which does not match (data:chararray), the output schema declared by the UDF.
> Unfortunately, the number of fields in the tuple cannot be known during the initial validation phase. I therefore believe the UDF's output schema should be changed to a plain tuple type, without fixing the number of fields to the number of columns in the input bag.
> With the current behavior, the elements of the tuple cannot be accessed in the script after calling BagToTuple without getting an incompatible type error. We have modified the UDF in our internal UDF jars to work around the issue. Let me know if this sounds reasonable and I can generate the patch.
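
The mismatch described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Pig's actual API: the `Schema` alias and the functions `bag_to_tuple`, `declared_schema_current`, and `declared_schema_proposed` are hypothetical names invented for illustration, and the "proposed" variant simply assumes that a tuple with unknown fields can be modeled as `None`.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a Pig schema: a list of (field name, type) pairs,
# or None when the tuple's fields are not known at validation time.
Schema = Optional[List[Tuple[str, str]]]


def bag_to_tuple(bag: List[tuple]) -> tuple:
    """Flatten every tuple in the bag into one tuple, as the issue describes."""
    return tuple(field for row in bag for field in row)


def declared_schema_current(bag_schema: List[Tuple[str, str]]) -> Schema:
    """Behavior reported in the issue: reuse the bag's column list as-is."""
    return list(bag_schema)


def declared_schema_proposed(bag_schema: List[Tuple[str, str]]) -> Schema:
    """Proposed behavior (assumed model): a tuple with no fixed field list."""
    return None


bag_schema = [("data", "chararray")]
bag = [("data1",), ("data2",), ("data3",)]

result = bag_to_tuple(bag)
current = declared_schema_current(bag_schema)
proposed = declared_schema_proposed(bag_schema)

# The declared schema promises one field; the runtime tuple carries three.
print("declared fields:", len(current), "actual fields:", len(result))
```

Under these assumptions, the declared schema depends only on the bag's columns while the runtime tuple's width depends on the number of rows, which is why no fixed field list can be correct in general.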



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
