hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-504) Illustrate and Dump do not seem to work correctly for files containing utf8
Date Tue, 21 Oct 2008 16:05:44 GMT

    [ https://issues.apache.org/jira/browse/PIG-504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641417#action_12641417 ]

Olga Natkovich commented on PIG-504:
------------------------------------

Shubham: regarding (1), illustrate should do the same thing as describe. If you look
at describe, you will see that it reports chararray.
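
For comparison, a minimal sketch of the describe check referred to here. It reuses the load statement from the issue description and assumes the same utf8.txt attachment; the exact layout of the printed schema may differ between Pig versions.

{code}
-- Same load statement as in the issue description.
A = load 'utf8.txt' using PigStorage() as (t1: chararray);
-- describe prints the declared schema of A; per the comment above,
-- t1 is expected to be reported as chararray here.
describe A;
{code}

If illustrate followed the same schema as describe, its table header would show t1 as chararray as well.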

> Illustrate and Dump do not seem to work correctly for files containing utf8
> ---------------------------------------------------------------------------
>
>                 Key: PIG-504
>                 URL: https://issues.apache.org/jira/browse/PIG-504
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>         Environment: Hadoop 18
>            Reporter: Viraj Bhat
>         Attachments: 504.patch, utf8.txt
>
>
> The following snippet of code, run on the latest types branch (utf8.txt attached),
> {code}
> A = load 'utf8.txt' using PigStorage() as (t1: chararray);
> illustrate A;
> {code}
> results in this output being produced
> -------------------------------
> | A     | t1: bytearray cn: 1 | 
> -------------------------------
> |       | gabriella??         | 
> -------------------------------
> Three observations:
> 1) text should be chararray, not bytearray.
> 2) cn: 1 should be removed from the display
> 3) The value for text, "username??", is not displayed properly
> Now replacing illustrate with dump
> {code}
> A = load 'utf8.txt' using PigStorage() as (t1: chararray);
> dump A;
> {code}
> (david?)
> (rachel?)
> (jessica?)
> (sarah?)
> (katie?)
> (wendy?)
> (david?)
> (priscilla?)
> (oscar?)
> (xavier?)
> ..some more. 
> The utf8 characters after username are not displayed correctly but instead substituted
> by ?.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

