hive-user mailing list archives

From Chandeep Singh ...@chandeep.com>
Subject Re: Field delimiter in hive
Date Tue, 08 Mar 2016 11:56:12 GMT
I’ve been pretty successful with two pipes (||) or two carets (^^), depending on my dataset,
even though they aren’t special Unicode control characters.
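
Whichever candidate you pick, it is worth checking it against a sample of the real data first. Below is a minimal Python sketch of that check, covering the delimiters mentioned in this thread. The function names and the sample rows are hypothetical and only for illustration; in practice you would feed it lines read from your own files.

```python
from collections import Counter

# Candidate delimiters mentioned in this thread; the keys are display labels only.
CANDIDATES = {
    "SOH (U+0001)": "\u0001",
    "RS (U+001E)": "\u001e",
    "US (U+001F)": "\u001f",
    "double pipe": "||",
    "double caret": "^^",
}


def count_delimiter_hits(lines, candidates=CANDIDATES):
    """Count how often each candidate delimiter occurs in the given lines."""
    hits = Counter({name: 0 for name in candidates})
    for line in lines:
        for name, delim in candidates.items():
            hits[name] += line.count(delim)
    return hits


def safe_delimiters(lines, candidates=CANDIDATES):
    """Return the labels of candidates that never occur in the given lines."""
    hits = count_delimiter_hits(lines, candidates)
    return [name for name in candidates if hits[name] == 0]


# Hypothetical sample rows, made up for this example.
sample = ["alice|42|a^b", "bob||17|x", "carol\u001f9|y"]
print(safe_delimiters(sample))
```

A candidate that shows zero hits on a sample can still turn up in data you have not scanned, so this narrows the choice but does not prove it safe.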

> On Mar 7, 2016, at 8:32 PM, mahender bigdata <Mahender.BigData@outlook.com> wrote:
> 
> Any help on this?
> 
> On 3/3/2016 2:38 PM, mahender bigdata wrote:
>> Hi,
>> 
>> I'm a bit confused about which character to use as a generic field delimiter for Hive
>> tables. Can anyone suggest the best Unicode character, one that doesn't occur as part of
>> the data?
>> 
>> Here are a couple of options I'm thinking of for the field delimiter. Please let me
>> know which one is best to use, i.e. which character is least likely to appear in
>> day-to-day data.
>> 
>> \U0001  = START OF HEADING (SOH)  ==> (CTRL+SHIFT+A in Windows)  ==> Hive default delimiter
>> 
>> 
>> \U001F  = INFORMATION SEPARATOR ONE = unit separator (US)  ==> (CTRL+SHIFT+- in Windows)
>> 
>> 
>> \U001E  = INFORMATION SEPARATOR TWO = record separator (RS)  ==> (CTRL+SHIFT+6 in Windows)
>> 
>> Somehow, going by the name, I feel \U001F is the best option. Can anyone comment, or suggest
>> the best Unicode character that doesn't occur in regular data?
>> 
>> 
>> 
> 

