hive-user mailing list archives

From Ujjwal Wadhawan <uwadha...@gmail.com>
Subject binary column data consistency in hive table copy
Date Mon, 14 Sep 2015 21:32:55 GMT
Hi all,



I recently observed a behavior in Hive that I’d like to share and get
input on.



*Scenario:*



Say you have a hive table with a binary column.



create table binsource (bincol binary);



and some input data



$ cat /nis3/home/ujjwal2/test2/binin

10000101

121

10

1011

Asfs





Let’s load the data into the table



LOAD DATA LOCAL INPATH '/home/ujjwal2/test2/binin' OVERWRITE INTO TABLE
binsource;



When I do a select * in the Hive CLI, I see the following characters (see image)


[image: http://puu.sh/k6HBw/877367d595.png]



The underlying HDFS file still has the actual input though.





Now I make a copy of this table using the command "create table
ujjwal2.bintarget as select * from ujjwal2.binsource;".


[image: http://puu.sh/k6HEj/b34a8bd4a0.png]



*ISSUE:*


Now when I look at the underlying file created on HDFS for bintarget, I see
some extra characters.





In the many combinations I have tried, the extra characters are “=”, “w” and
“A”.


10000101

120=

1w==

1011

Asfs
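
One possible reading, offered as an assumption rather than a confirmed answer: the output above is what you would get if each input line were treated as Base64 text, decoded leniently (no padding required, leftover bits dropped), and then re-encoded as Base64 when the copy is written. The sketch below is not Hive code; the helper names lenient_b64decode and round_trip are hypothetical and only illustrate that such a round trip reproduces the five rows shown.

```python
import base64

# Standard Base64 alphabet (RFC 4648), index = 6-bit value.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def lenient_b64decode(text: str) -> bytes:
    """Decode Base64 without requiring padding; leftover bits are dropped.

    Hypothetical helper: it mimics a forgiving decoder, which is an
    assumption about what happens to the text on read.
    """
    bits = 0   # bit accumulator
    nbits = 0  # number of not-yet-emitted bits in the accumulator
    out = bytearray()
    for ch in text:
        if ch == "=":  # padding marks the end of the data
            break
        bits = (bits << 6) | B64_ALPHABET.index(ch)
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
    return bytes(out)


def round_trip(text: str) -> str:
    """Leniently decode, then re-encode with standard padded Base64."""
    return base64.b64encode(lenient_b64decode(text)).decode("ascii")


for row in ["10000101", "121", "10", "1011", "Asfs"]:
    print(row, "->", round_trip(row))
```

Under this reading, rows whose length is a multiple of four survive unchanged, while "121" and "10" lose their trailing partial bits and gain "=" padding, which would account for "120=" and "1w==".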


Does anyone know what these characters signify?



Best,

Ujjwal
