hive-user mailing list archives

From Luke Lovett <luke.lov...@10gen.com>
Subject CombineHiveInputFormat does not call getSplits on custom InputFormat
Date Thu, 19 Feb 2015 18:09:37 GMT
I'm working on defining a custom InputFormat and OutputFormat for use 
with Hive. I'd like tables using these IF/OF to be native tables, so 
that I can LOAD DATA and INSERT INTO them. However, I'm finding that 
with the default CombineHiveInputFormat, the getSplits method of my 
InputFormat is not being called. If I instead run 
"set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;", 
then getSplits is called.
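
For concreteness, the setup described above might look roughly like the following sketch. The table name, columns, and the com.example class names are hypothetical placeholders, not the actual classes in question; only the two hive.input.format values are real Hive class names.

```sql
-- Hypothetical native table backed by a custom InputFormat/OutputFormat.
-- Table name, columns, and com.example.* class names are placeholders.
CREATE TABLE events (id INT, payload STRING)
STORED AS
  INPUTFORMAT 'com.example.MyInputFormat'
  OUTPUTFORMAT 'com.example.MyOutputFormat';

-- With the combining input format, the custom getSplits is reportedly
-- not invoked when this query is planned.
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
SELECT COUNT(*) FROM events;

-- Workaround from this message: with HiveInputFormat, the custom
-- getSplits is called.
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
SELECT COUNT(*) FROM events;
```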

What I want to know is:
- Is this difference in behavior between CombineHiveInputFormat and 
HiveInputFormat intentional?
- Is there any way of forcing CombineHiveInputFormat to call getSplits 
on my own InputFormat? I was reading through the code for 
CombineHiveInputFormat, and it looks like it might only call my own 
InputFormat's getSplits method if the table is non-native. I'm not sure 
if I'm interpreting this correctly.
- Is it better to set "hive.input.format" to work around this, or to 
create a StorageHandler and make non-native tables?
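
As a point of comparison for the last question, a non-native table is declared with STORED BY and a storage handler class rather than STORED AS with explicit formats. A minimal sketch, assuming a hypothetical handler class com.example.MyStorageHandler and a made-up table property:

```sql
-- Hypothetical non-native table; the handler class and the property
-- name/value are placeholders for illustration only.
CREATE TABLE events_nn (id INT, payload STRING)
STORED BY 'com.example.MyStorageHandler'
TBLPROPERTIES ('example.source' = 'placeholder');
```

The trade-off is that LOAD DATA is not supported on non-native tables, which conflicts with the goal stated at the top of this message.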

Thanks for any advice.

