hive-issues mailing list archives

From "Aleksei S (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-11721) non-ascii characters shows improper with "insert into"
Date Sun, 18 Oct 2015 05:40:05 GMT

     [ https://issues.apache.org/jira/browse/HIVE-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksei S updated HIVE-11721:
-----------------------------
    Attachment: HIVE-11721.patch

I debugged the issue and found the cause: the contents of the virtual table are
written as bytes, keeping only the lower 8 bits of each character, which does not work for
non-ASCII characters. The fix is to create a Text object (Text is the storage format used
for the virtual table) and encode the values with it.
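
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the
code from the patch: the class and method names are hypothetical, and it uses plain
String.getBytes(UTF_8) as a stand-in for Hadoop's Text (which stores strings as UTF-8),
so that it runs without Hadoop on the classpath. It assumes the lossy write behaves like
java.io.DataOutputStream.writeBytes, which discards the high eight bits of each char.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative and not taken from Hive.
public class NonAsciiWriteDemo {

    // Lossy path: writeBytes emits one byte per char and drops the high eight bits.
    static byte[] writeLowBytes(String value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Lossless path: encode to UTF-8 first, then write the encoded bytes.
    // (Stand-in for encoding through a Text object, which also holds UTF-8.)
    static byte[] writeUtf8(String value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.write(value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reader side: the stored bytes are interpreted as UTF-8.
    static String readUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Garçu 谢谢", written with Unicode escapes to avoid source-encoding issues.
        String value = "Gar\u00e7u \u8c22\u8c22";

        // U+8C22 keeps only 0x22, the ASCII double quote.
        System.out.println(value.equals(readUtf8(writeLowBytes(value)))); // false
        System.out.println(value.equals(readUtf8(writeUtf8(value))));     // true
    }
}
```

Dropping the high byte of U+8C22 leaves 0x22, which is consistent with each 谢 appearing
as a double quote in the reported output.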

> non-ascii characters shows improper with "insert into"
> ------------------------------------------------------
>
>                 Key: HIVE-11721
>                 URL: https://issues.apache.org/jira/browse/HIVE-11721
>             Project: Hive
>          Issue Type: Bug
>          Components: Database/Schema
>    Affects Versions: 1.1.0, 1.2.1, 2.0.0
>            Reporter: Jun Yin
>         Attachments: HIVE-11721.patch
>
>
> Hive: 1.1.0
> hive> create table char_255_noascii as select cast("Garçu 谢谢 Kôkaku ありがとうございますkidôtai한국어" as char(255));
> hive> select * from char_255_noascii;
> OK
> Garçu 谢谢 Kôkaku ありがとうございますkidôtai한국어
> This displays correctly, and it also works with "LOAD DATA",
> but when I try to insert the data another way, as below:
> hive> create table nonascii(t1 char(255));
> OK
> Time taken: 0.125 seconds
> hive> insert into nonascii values("Garçu 谢谢 Kôkaku ありがとうございますkidôtai한국어");
> hive> select * from nonascii;
> OK
> Gar�u "" K�kaku B�LhFTVD~Ykid�tai\m� 



