pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1341) BinStorage cannot convert DataByteArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED
Date Thu, 01 Apr 2010 19:25:27 GMT

    [ https://issues.apache.org/jira/browse/PIG-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852515#action_12852515 ]

Daniel Dai commented on PIG-1341:
---------------------------------

After a discussion with Alan and Richard, we feel that a caster for BinStorage does not make
sense. We don't know how to cast the bytearray datatype for BinStorage. In the intermediate storage
case, we find the original loader and use the lineage for that loader to convert the bytearray.
But if the user uses BinStorage directly, we have no idea what the bytearray means. So the suggestion
is that we don't give BinStorage a caster. The implication is that if a user wants to use BinStorage
as a temporary store, it will fail in some cases.
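
To illustrate why a bytearray has no meaning without knowing which loader produced it, here
is a minimal sketch in Python (not Pig code). The two cast functions are hypothetical stand-ins
for the casters of two different loaders; neither is a real Pig API.

```python
import struct

# The same four raw bytes, as they might sit in a bytearray field.
raw = b"1234"

def cast_as_text_int(data: bytes) -> int:
    """Hypothetical text-style caster: bytes are UTF-8 digits."""
    return int(data.decode("utf-8"))

def cast_as_binary_int(data: bytes) -> int:
    """Hypothetical binary-style caster: bytes are a 4-byte big-endian signed int."""
    return struct.unpack(">i", data)[0]

text_value = cast_as_text_int(raw)      # 1234
binary_value = cast_as_binary_int(raw)  # 825373492 (0x31323334)
```

Both results are valid readings of the same bytes, so only the lineage back to the original
loader can say which conversion is the right one.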

Here is a sample pair of scripts that will break if we make this change:

script 1:
{code}
a = load '1.txt';
b = order a by $0;
store b into 'temp.out' using BinStorage(); -- store in BinStorage format with the datatype bytearray
{code}

script 2:
{code}
a = load 'temp.out' using BinStorage();
b = foreach a generate $0+$1;   -- a caster is needed here, but BinStorage does not have one, so this fails
{code}

> BinStorage cannot convert DataByteArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1341
>                 URL: https://issues.apache.org/jira/browse/PIG-1341
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Richard Ding
>             Fix For: 0.7.0
>
>         Attachments: PIG-1341.patch
>
>
> Script reads in BinStorage data and tries to convert a column which is in DataByteArray to Chararray.
> {code}
> raw = load 'sampledata' using BinStorage() as (col1,col2, col3);
> --filter out null columns
> A = filter raw by col1#'bcookie' is not null;
> B = foreach A generate col1#'bcookie'  as reqcolumn;
> describe B;
> --B: {reqcolumn: bytearray}
> X = limit B 5;
> dump X;
> B = foreach A generate (chararray)col1#'bcookie'  as convertedcol;
> describe B;
> --B: {convertedcol: chararray}
> X = limit B 5;
> dump X;
> {code}
> The first dump produces:
> (36co9b55onr8s)
> (36co9b55onr8s)
> (36hilul5oo1q1)
> (36hilul5oo1q1)
> (36l4cj15ooa8a)
> The second dump produces:
> ()
> ()
> ()
> ()
> ()
> It also throws an error message: FIELD_DISCARDED_TYPE_CONVERSION_FAILED 5 time(s).
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

