hive-user mailing list archives

From Ayon Sinha <ayonsi...@yahoo.com>
Subject URGENT: Hive not respecting escaped delimiter characters
Date Tue, 26 Jul 2011 21:13:39 GMT
We have database dumps that use TAB as the field delimiter. Fields that contain a TAB have it
escaped in the dumped text file, but Hive does not respect escaped delimiters, so
create external table scratch.delete_me (a int, b int, c bigint, d string, e int) ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE LOCATION '/tmp/users';


creates a table in which e is NULL for some rows.

Hive also does not allow multi-character delimiters in the ROW FORMAT DELIMITED spec.

What is the cleanest way to get past this problem? Options are:
1. Write custom SerDe class
2. Use RegexSerDe
3. Remove escaped delimiter chars from data

I need to know the roadblocks before I invest time in any one of them.
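
For option 3, a minimal preprocessing sketch in Python might look like the following. It assumes the dump escapes an embedded TAB as a backslash immediately followed by a literal TAB; the actual escape convention of the dump tool is not stated here, so adjust the pattern accordingly. The function names and the single-space replacement are hypothetical choices for illustration.

```python
import re

# Assumption: an escaped delimiter is a backslash immediately followed by a TAB.
# An escaped backslash that happens to precede a real delimiter is NOT handled here.
ESCAPED_TAB = re.compile(r"\\\t")


def strip_escaped_tabs(line: str, replacement: str = " ") -> str:
    """Replace each backslash-TAB pair, leaving real field delimiters intact."""
    return ESCAPED_TAB.sub(replacement, line)


def clean_file(src_path: str, dst_path: str) -> None:
    """Stream the dump line by line so large files never sit fully in memory."""
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(strip_escaped_tabs(line))


if __name__ == "__main__":
    # One row whose fourth field contains an escaped TAB.
    row = "1\t2\t3\tfoo\\\tbar\t5"
    print(strip_escaped_tabs(row).split("\t"))
```

After cleaning, each row should split into exactly five fields on TAB, so the column e no longer shifts out of position. The trade-off is that the embedded TAB is lost from the string field.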
 
-Ayon
