hive-user mailing list archives

From Nick Burch
Subject Importing external data into a RCFile table, without going via CSV?
Date Tue, 03 Mar 2015 11:01:20 GMT
Hi All

I'm currently looking to get some data into a rcfile stored table in Hive. 
The data is in a format (SAS) which I can read in Java, but it's not 
one that Hive supports, nor one I can get any sensibly priced converter for.

Having spent some time reading docs and blogs, and trying a few bits out, 
it seems I can write a small Java program to convert the source data into 
a CSV, then perform a LOAD DATA call to get it into a delimited text file 
table in Hive, then select/insert from that into my rcfile table. That's 
more steps than I'd like, though.
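
To make that flow concrete, here is a rough sketch of the conversion step, under some assumptions. The class name CsvForHive, the in-memory sample rows (standing in for whatever the SAS reader returns) and the table names in the comments are all hypothetical. It assumes the staging table is declared with ESCAPED BY '\\', so that commas inside values can be backslash-escaped, and it relies on Hive's default text representation of NULL, which is \N.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: writes rows as comma-delimited text that a Hive
// delimited text table (declared with ESCAPED BY '\\') should be able to read.
public class CsvForHive {

    // Escape one field: null becomes \N (Hive's default NULL marker),
    // backslashes and commas are backslash-escaped, line breaks become spaces.
    public static String escape(String value) {
        if (value == null) {
            return "\\N";
        }
        return value.replace("\\", "\\\\")
                    .replace(",", "\\,")
                    .replace("\r", " ")
                    .replace("\n", " ");
    }

    // Join the escaped fields of one row with commas.
    public static String toLine(List<String> row) {
        return row.stream().map(CsvForHive::escape).collect(Collectors.joining(","));
    }

    // Write every row as one line of UTF-8 text.
    public static void write(Iterable<List<String>> rows, Path out) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (List<String> row : rows) {
                w.write(toLine(row));
                w.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample rows standing in for records read from the SAS file.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "Smith, J"),
                Arrays.asList("2", null));
        Path out = Files.createTempFile("staging", ".csv");
        write(rows, out);
        System.out.println("Wrote " + out);

        // The remaining steps would then run in Hive (hypothetical table names
        // and path):
        //   CREATE TABLE staging (id INT, name STRING)
        //     ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\'
        //     STORED AS TEXTFILE;
        //   LOAD DATA LOCAL INPATH '/tmp/staging.csv' INTO TABLE staging;
        //   CREATE TABLE target (id INT, name STRING) STORED AS RCFILE;
        //   INSERT OVERWRITE TABLE target SELECT * FROM staging;
    }
}
```

Escaping rather than quoting is a deliberate choice in this sketch: as far as I can tell, a plain delimited text table does not interpret CSV-style quotes, so quoted fields would need a different SerDe.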

I've seen some things about using Pig and HCatalog to generate Hive tables 
using rcfile storage which look similar to what I want, except that I'm 
not using Pig so that means some more steps again.

Reading the javadocs for, and 
some code that uses them, it seems I could write a converter in Java, but 
it doesn't look that much fun.

That leads me to wonder:
* Are there any higher-level Java libraries for writing into an rcfile,
   standalone or in Hive?
* Are there better data flows than (binary file) -> csv -> temp hive
   table -> target hive rcfile table?

