hbase-user mailing list archives

From Fran O <franob...@gmail.com>
Subject HFILE creation to use a different committer
Date Thu, 16 Mar 2017 16:29:28 GMT
Hi folks,

I would like to hear some thoughts on the following use case:

I use a custom MR job to create HFiles. This job writes the HFiles to S3.

I was trying to change the OutputCommitter so that the reducers write the
HFiles directly to their final destination on S3. After some tests with the
OutputCommitter set to DirectOutputCommitter, the tasks are still always
using the FileOutputCommitter.

    HFileOutputFormat2.configureIncrementalLoad(job, hTable);
    FileOutputFormat.setOutputPath(job, outputPath);
    FileOutputFormat.setCompressOutput(job, true);
    FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);

Looking at the FileOutputFormat methods
<https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html>
I see a getOutputCommitter method
<https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html#getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext)>
but no corresponding set method for the OutputCommitter.

Could someone shed some light on how to change the OutputCommitter used by
the tasks?

Thank you,
Fran
