hbase-dev mailing list archives

From ShaoFeng Shi <shaofeng...@apache.org>
Subject Re: HFileOutputFormat2 hardcodes default FileOutputCommitter
Date Wed, 27 Sep 2017 01:36:08 GMT
I have created a JIRA and attached a patch:
https://issues.apache.org/jira/browse/HBASE-18885

Please review and merge; we need this fix in a future release. Thanks.
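
To illustrate the mismatch described in the quoted report below, here is a minimal, self-contained sketch. It uses no Hadoop or HBase classes: `Committer`, `StagingCommitter`, `DirectCommitter`, `hardcodedWriterDir`, and `configuredWriterDir` are hypothetical stand-ins invented for illustration, and the single "_temporary" suffix is a simplification of the real staging layout. It is not the patch attached to the JIRA, only a model of the idea that the writer should ask the configured committer where to write.

```java
public class CommitterSketch {

    /** Hypothetical stand-in for an output committer: decides where task files are written. */
    interface Committer {
        String workPath(String outputPath);
    }

    /** Models a staging committer: files go under "_temporary" and are moved on commit. */
    static final class StagingCommitter implements Committer {
        @Override
        public String workPath(String outputPath) {
            return outputPath + "/_temporary";
        }
    }

    /** Models a direct committer: files are expected directly under the output path. */
    static final class DirectCommitter implements Committer {
        @Override
        public String workPath(String outputPath) {
            return outputPath;
        }
    }

    /** The reported behavior: the configured committer is ignored. */
    static String hardcodedWriterDir(String outputPath, Committer configured) {
        return new StagingCommitter().workPath(outputPath);
    }

    /** The proposed direction: the writer follows whatever committer the job configured. */
    static String configuredWriterDir(String outputPath, Committer configured) {
        return configured.workPath(outputPath);
    }

    public static void main(String[] args) {
        String output = "s3://bucket/hfiles"; // assumed example path
        Committer configured = new DirectCommitter();

        System.out.println("committer expects: " + configured.workPath(output));
        System.out.println("hardcoded writer:  " + hardcodedWriterDir(output, configured));
        System.out.println("configured writer: " + configuredWriterDir(output, configured));
    }
}
```

In this model the hardcoded writer and the direct committer disagree about the directory, so nothing is ever moved into place, which matches the symptom in the report. When the writer defers to the configured committer, both sides agree.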

2017-09-26 19:00 GMT+08:00 ShaoFeng Shi <shaofengshi@apache.org>:

> Here is the pull request:
>
> https://github.com/apache/hbase/pull/60
>
> 2017-09-26 17:16 GMT+08:00 ShaoFeng Shi <shaofengshi@apache.org>:
>
>> Hello gentlemen,
>>
>> This is Shaofeng Shi from the Apache Kylin community. We use HBase as the
>> storage engine, and we use an MR job to generate HFiles before bulk load. A
>> user reported that, when S3 is configured as the output location for the
>> HFiles, the files are generated in the "_temporary" folder and are never
>> committed to the target path, so no data ends up being loaded. We can
>> reproduce this problem easily. The original report is in [1].
>>
>> Kylin uses HBase's HFileOutputFormat2.java to configure the MR job. After
>> some investigation, I found that this class always uses the default
>> "FileOutputCommitter" (see [2]), regardless of the job's configuration, so
>> it always writes to the "_temporary" folder. Because AWS EMR is configured
>> to use DirectOutputCommitter for S3, the problem occurs: Hadoop expects to
>> see the files directly under the output path, while the RecordWriter
>> generates them in the "_temporary" folder.
>>
>> Have you received reports like this before? I have a temporary fix in my
>> fork for now. I am wondering what you think about it; if it is okay, I will
>> file a JIRA. Thanks!
>>
>> [1] https://issues.apache.org/jira/browse/KYLIN-2788
>> [2] https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java#L193
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋


-- 
Best regards,

Shaofeng Shi 史少锋
