hive-user mailing list archives

From "Lavelle, Shawn" <>
Subject RE: Implementing a custom StorageHandler
Date Wed, 29 Jun 2016 21:32:23 GMT
I don’t have answers for you, except for #1: the org.apache.hadoop.mapreduce package contains the newer API classes in Hadoop, from my understanding. They’ve been out for a while, but the Hive storage handler API hasn’t been updated to make use of them. Which leads me to my very related question: when might Hive provide a storage handler interface that uses the new classes, and if not, why not?
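[For readers of the archive, the structural difference between the two APIs can be sketched roughly as follows. The types below are simplified stand-ins, NOT the real Hadoop classes; names and signatures are abbreviated purely to contrast the two designs.]

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins mirroring the shapes of the two Hadoop InputFormat APIs.
// These are NOT the real org.apache.hadoop classes.
public class InputFormatApiContrast {

    // Old API (org.apache.hadoop.mapred): interfaces, array results,
    // and getSplits receives an explicit numSplits hint.
    interface OldStyleInputFormat {
        String[] getSplits(String jobConf, int numSplits);
    }

    // New API (org.apache.hadoop.mapreduce): abstract classes, List results,
    // and no numSplits hint -- split sizing is instead driven by
    // configuration read from the job context.
    abstract static class NewStyleInputFormat {
        abstract List<String> getSplits(String jobContext);
    }

    public static void main(String[] args) {
        OldStyleInputFormat oldFmt = (conf, hint) -> new String[] { "split-0" };
        NewStyleInputFormat newFmt = new NewStyleInputFormat() {
            @Override
            List<String> getSplits(String ctx) {
                return Arrays.asList("split-0");
            }
        };
        System.out.println(oldFmt.getSplits("conf", 1).length);  // prints 1
        System.out.println(newFmt.getSplits("ctx").size());      // prints 1
    }
}
```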


~ Shawn M Lavelle

From: Long, Andrew []
Sent: Monday, June 27, 2016 5:59 PM
To: user <>
Subject: Implementing a custom StorageHandler

Hello everyone,

I’m in the process of implementing a custom StorageHandler and I had some questions.

1)      What is the difference between org.apache.hadoop.mapred.InputFormat and org.apache.hadoop.mapreduce.InputFormat?

2)      How is numSplits calculated in org.apache.hadoop.mapred.InputFormat.getSplits(JobConf
job, int numSplits)?

3)      Is there a way to enforce a maximum number of splits?  What would happen if I ignored
numSplits and just returned an array containing the actual maximum number of splits?

4)      How is InputSplit.getLocations() used?  If I’m accessing non-HDFS resources, what
should I return?  Currently I’m just returning an empty array.
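[The empty-array pattern described in #4 can be sketched as below. The class is a simplified stand-in, not the real org.apache.hadoop.mapred.InputSplit; the locality-hint behavior is how getLocations() is generally understood to work, since it only suggests preferred hosts for task scheduling.]

```java
// Simplified stand-in illustrating the empty-locations pattern from #4.
// getLocations() is purely a data-locality hint for task scheduling, so a
// split over a non-HDFS resource can return an empty array, meaning
// "no preferred host" -- the scheduler may place the task on any node.
public class NonHdfsSplit {
    private final String resourceUri; // e.g. a JDBC URL or REST endpoint (hypothetical)
    private final long length;

    public NonHdfsSplit(String resourceUri, long length) {
        this.resourceUri = resourceUri;
        this.length = length;
    }

    // In the real API this would be InputSplit.getLength().
    public long getLength() {
        return length;
    }

    // Empty array == no locality preference.
    public String[] getLocations() {
        return new String[0];
    }

    public static void main(String[] args) {
        NonHdfsSplit split = new NonHdfsSplit("jdbc:example://host/db", 0L);
        System.out.println(split.getLocations().length);  // prints 0
    }
}
```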

Thanks for your time,
Andrew Long