flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: Hadoop compatibility and HBase bulk loading
Date Fri, 10 Apr 2015 10:07:31 GMT
I think I could also take care of it if somebody can help and guide me a
little bit.
How long do you think it would take to complete such a task?

On Fri, Apr 10, 2015 at 12:02 PM, Fabian Hueske <fhueske@gmail.com> wrote:

> We had an effort to execute any HadoopMR program by simply specifying the
> JobConf and executing it (even embedded in regular Flink programs).
> We got quite far but did not complete it (counters and custom grouping /
> sorting functions for Combiners are missing, if I remember correctly).
> I don't think that anybody is working on that right now, but it would
> definitely be a cool feature.
>
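For reference, here is a minimal sketch of what already works today with the
Hadoop compatibility wrappers from the blog post linked below (the
flink-hadoop-compatibility module): unmodified mapred Mappers, Reducers and
Input/OutputFormats can be reused, but the program still has to be assembled by
hand in Flink, whereas the missing piece Fabian describes would take a plain
JobConf as-is. It follows the WordCount from that post; Tokenizer and Counter
are ordinary Hadoop classes, written inline here only to keep the example
self-contained.

import java.io.IOException;
import java.util.Iterator;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapred.HadoopInputFormat;
import org.apache.flink.api.java.hadoop.mapred.HadoopOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.mapred.HadoopMapFunction;
import org.apache.flink.hadoopcompatibility.mapred.HadoopReduceCombineFunction;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class HadoopCompatWordCount {

  public static void main(String[] args) throws Exception {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // Read the input with Hadoop's unmodified TextInputFormat.
    HadoopInputFormat<LongWritable, Text> input = new HadoopInputFormat<LongWritable, Text>(
        new TextInputFormat(), LongWritable.class, Text.class, new JobConf());
    TextInputFormat.addInputPath(input.getJobConf(), new Path(args[0]));
    DataSet<Tuple2<LongWritable, Text>> text = env.createInput(input);

    DataSet<Tuple2<Text, LongWritable>> counts = text
        // run the unmodified Hadoop Mapper as a Flink FlatMapFunction
        .flatMap(new HadoopMapFunction<LongWritable, Text, Text, LongWritable>(new Tokenizer()))
        .groupBy(0)
        // run the unmodified Hadoop Reducer as both combiner and reducer
        .reduceGroup(new HadoopReduceCombineFunction<Text, LongWritable, Text, LongWritable>(
            new Counter(), new Counter()));

    // Write the result with Hadoop's unmodified TextOutputFormat.
    HadoopOutputFormat<Text, LongWritable> output = new HadoopOutputFormat<Text, LongWritable>(
        new TextOutputFormat<Text, LongWritable>(), new JobConf());
    TextOutputFormat.setOutputPath(output.getJobConf(), new Path(args[1]));
    counts.output(output);

    env.execute("Hadoop MapReduce WordCount on Flink");
  }

  // Plain Hadoop mapred Mapper, defined here only for the sake of the example.
  public static final class Tokenizer extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, LongWritable> out,
        Reporter reporter) throws IOException {
      for (String word : value.toString().toLowerCase().split("\\W+")) {
        if (!word.isEmpty()) {
          out.collect(new Text(word), new LongWritable(1L));
        }
      }
    }
  }

  // Plain Hadoop mapred Reducer, used above as both combiner and reducer.
  public static final class Counter extends MapReduceBase
      implements Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    public void reduce(Text key, Iterator<LongWritable> values,
        OutputCollector<Text, LongWritable> out, Reporter reporter) throws IOException {
      long sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      out.collect(key, new LongWritable(sum));
    }
  }
}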
> 2015-04-10 11:55 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:
>
>> Hi guys,
>>
>> I have a nice question about Hadoop compatibility.
>> In https://flink.apache.org/news/2014/11/18/hadoop-compatibility.html
>> you say that you can reuse existing mapreduce programs.
>> Would it also be possible to handle complex mapreduce programs like the
>> HBase BulkImport, which uses, for example, a custom partitioner
>> (org.apache.hadoop.mapreduce.Partitioner)?
>>
>> In the bulk-import examples, the call to
>> HFileOutputFormat2.configureIncrementalLoadMap sets a series of job
>> parameters (like the partitioner, mapper, reducers, etc.) ->
>> http://pastebin.com/8VXjYAEf.
>> The full code of it can be seen at
>> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
>> .
>>
>> Do you think there's any chance to make it run in Flink?
>>
>> Best,
>> Flavio
>>
>
>
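Regarding the HFileOutputFormat2 question above: as long as there is no generic
JobConf runner, one possible direction is to wrap the unmodified
HFileOutputFormat2 with Flink's mapreduce HadoopOutputFormat and reproduce by
hand what configureIncrementalLoadMap() normally wires into the job, i.e. the
total-order partitioning by region and the sorted key order that HFile writing
requires (here approximated with partitionCustom and sortPartition). Below is a
rough, untested sketch of that idea: RegionPartitioner, writeHFiles, splitKeys,
cells and hfileDir are placeholders, and the per-family compression/bloom-filter
settings that configureIncrementalLoadMap serializes into the configuration are
ignored (HFileOutputFormat2 falls back to its defaults).

import org.apache.flink.api.common.functions.Partitioner;
import org.apache.flink.api.common.operators.Order;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class BulkLoadSketch {

  // Placeholder for the TotalOrderPartitioner that configureIncrementalLoadMap()
  // would install: routes each row key to the region whose key range contains it.
  public static class RegionPartitioner implements Partitioner<ImmutableBytesWritable> {
    private final byte[][] splitKeys; // region start keys, sorted ascending

    public RegionPartitioner(byte[][] splitKeys) {
      this.splitKeys = splitKeys;
    }

    @Override
    public int partition(ImmutableBytesWritable key, int numPartitions) {
      int region = 0;
      while (region < splitKeys.length
          && Bytes.compareTo(key.get(), key.getOffset(), key.getLength(),
              splitKeys[region], 0, splitKeys[region].length) >= 0) {
        region++;
      }
      return region % numPartitions;
    }
  }

  // 'cells' is assumed to come out of an upstream Flink job, 'splitKeys' are the
  // target table's region boundaries, 'hfileDir' is the bulk-load staging directory.
  public static void writeHFiles(DataSet<Tuple2<ImmutableBytesWritable, Cell>> cells,
      byte[][] splitKeys, String hfileDir) throws Exception {

    // Flink does not look at the partitioner/reducer classes that
    // configureIncrementalLoadMap() sets on the Job; only the output format
    // and its output path are used here.
    Job job = Job.getInstance(HBaseConfiguration.create());
    HFileOutputFormat2.setOutputPath(job, new Path(hfileDir));

    cells
        // stand-in for the TotalOrderPartitioner step
        .partitionCustom(new RegionPartitioner(splitKeys), 0)
        // stand-in for the sorting reducer: HFiles must be written in sorted key order
        .sortPartition(0, Order.ASCENDING)
        // hand the tuples to the unmodified HBase output format
        .output(new HadoopOutputFormat<ImmutableBytesWritable, Cell>(
            new HFileOutputFormat2(), job));
  }
}

The HFiles written this way would then still be handed to the usual
completebulkload step (LoadIncrementalHFiles) to move them into the table.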
