pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1341) BinStorage cannot convert DataByteArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED
Date Wed, 21 Apr 2010 18:02:53 GMT

    [ https://issues.apache.org/jira/browse/PIG-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859453#action_12859453 ]

Alan Gates commented on PIG-1341:
---------------------------------

I agree with Ashutosh.  We do not want BinStorage tracking data lineage.  When Pig itself
uses BinStorage (or whatever) to move data between MR jobs, Pig can figure out the correct
cast function to use and apply it.  In cases such as this one, where users store data using
BinStorage and then read it in a separate Pig Latin script (thus losing the type
information), it is the user's responsibility to cast the data correctly before storing it
in BinStorage.  As a general rule, I do not think we can expect load and store functions to
track data lineage across Pig Latin scripts.
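
As a rough sketch of what casting before storing could look like (the input path 'rawinput',
the use of PigStorage as the original loader, and the map[] schema are assumptions made for
illustration; they are not taken from the reporter's setup):

{code}
-- Hypothetical storing script: cast while Pig still knows which loader produced the bytes.
-- 'rawinput', PigStorage() and the map[] schema are illustrative assumptions.
raw = load 'rawinput' using PigStorage() as (col1:map[], col2, col3);
A = filter raw by col1#'bcookie' is not null;
-- The cast happens here, so the value is written to BinStorage as a chararray
-- rather than as an untyped bytearray.
B = foreach A generate (chararray)col1#'bcookie' as bcookie;
store B into 'sampledata' using BinStorage();
{code}

The point of doing it this way is that the cast runs in the script that knows where the bytes
came from, so a later script reading 'sampledata' should not need to cast the bytearray itself.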

I propose we close this as won't fix.

> BinStorage cannot convert DataByteArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1341
>                 URL: https://issues.apache.org/jira/browse/PIG-1341
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Richard Ding
>         Attachments: PIG-1341.patch
>
>
> Script reads in BinStorage data and tries to convert a column which is in DataByteArray to Chararray.
> {code}
> raw = load 'sampledata' using BinStorage() as (col1,col2, col3);
> --filter out null columns
> A = filter raw by col1#'bcookie' is not null;
> B = foreach A generate col1#'bcookie'  as reqcolumn;
> describe B;
> --B: {reqcolumn: bytearray}
> X = limit B 5;
> dump X;
> B = foreach A generate (chararray)col1#'bcookie'  as convertedcol;
> describe B;
> --B: {convertedcol: chararray}
> X = limit B 5;
> dump X;
> {code}
> The first dump produces:
> (36co9b55onr8s)
> (36co9b55onr8s)
> (36hilul5oo1q1)
> (36hilul5oo1q1)
> (36l4cj15ooa8a)
> The second dump produces:
> ()
> ()
> ()
> ()
> ()
> It also throws an error message: FIELD_DISCARDED_TYPE_CONVERSION_FAILED 5 time(s).
> Viraj

