hadoop-pig-dev mailing list archives

From "David Ciemiewicz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-505) Lineage for UDFs that do not return bytearray
Date Wed, 29 Oct 2008 00:16:44 GMT

    [ https://issues.apache.org/jira/browse/PIG-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643382#action_12643382 ]

David Ciemiewicz commented on PIG-505:

I don't understand this statement:

If the map contains a DataByteArray (bytearray) then we will not be able to convert it to
any of the Pig types. For DataByteArray, Pig does not have a mechanism to interpret the bytes.
Only non-DataByteArray types can be converted to Pig types as long as they can be converted,
i.e., int to float, int to long, etc.

The Pig 1.4 to 2.0 Transition document (1.3 Cast) says:
from / to   bag   tuple   map   int   long   float   double   chararray   bytearray
bytearray   yes   yes     yes   yes   yes    yes     yes      yes

> Lineage for UDFs that do not return bytearray
> ---------------------------------------------
>                 Key: PIG-505
>                 URL: https://issues.apache.org/jira/browse/PIG-505
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: types_branch
> In PIG-335, the lineage design states that UDFs that return bytearrays could cause problems
> in tracing the lineage. For UDFs that do not return bytearray, the lineage design should pick up
> the right load function to use as long as there is no ambiguity. In the current implementation,
> we could have issues with scripts like:
> {code}
> a = load 'input' as (field1);
> b = foreach a generate myudf_to_double(field1);
> c = foreach b generate $0 + 2.0;
> {code}
> When $0 has to be cast to a double, the lineage code will complain that it hit a UDF
> and hence cannot determine the right load function to use.
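
The rule the issue description asks for can be illustrated with a small toy model. The sketch below is not Pig's code: the Field class, the find_load_func helper, and the choice of PigStorage as the loader are all assumptions made for illustration. It traces a field back through the plan, refuses to go past a UDF that returns bytearray, and otherwise accepts the lineage only when exactly one load function is found.

```python
# Toy model of load-function lineage. Every name here (Field,
# find_load_func) is hypothetical; this is not Pig's implementation.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Field:
    """A field in a simplified logical plan."""
    name: str
    dtype: str                         # e.g. "bytearray" or "double"
    loader: Optional[str] = None       # set only for fields produced by a load
    inputs: Tuple["Field", ...] = ()   # upstream fields; empty for a load
    via_udf: bool = False              # True if a UDF produced this field


def find_load_func(field: Field) -> str:
    """Return the single load function behind `field`, or raise ValueError."""
    if field.loader is not None:
        return field.loader
    if field.via_udf and field.dtype == "bytearray":
        # The case PIG-335 flags: the UDF's bytes may not be in the
        # loader's format, so the trace stops here.
        raise ValueError(f"{field.name}: UDF returns bytearray")
    loaders = {find_load_func(f) for f in field.inputs}
    if len(loaders) != 1:
        # Zero or several candidate loaders: the lineage is ambiguous.
        raise ValueError(f"{field.name}: ambiguous lineage {sorted(loaders)}")
    return next(iter(loaders))


# Mirrors the script above: field1 comes from the load, and $0 is the
# double returned by myudf_to_double(field1). PigStorage is assumed.
field1 = Field("field1", "bytearray", loader="PigStorage")
udf_out = Field("$0", "double", inputs=(field1,), via_udf=True)
print(find_load_func(udf_out))
```

In this model the UDF in the example script does not block the trace, because it returns double and has a single upstream loader. Only a bytearray-returning UDF or a mix of loaders raises an error.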

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
