flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Hadoop compatibility and HBase bulk loading
Date Fri, 10 Apr 2015 10:02:29 GMT
We had an effort to execute any Hadoop MapReduce program by simply specifying
its JobConf and running it (even embedded in regular Flink programs).
We got quite far but did not complete it (counters and custom grouping /
sorting functions for Combiners are missing, if I remember correctly).
I don't think that anybody is working on that right now, but it would
definitely be a cool feature.
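
For reference, the Hadoop compatibility wrappers that do exist already let you
reuse unmodified Hadoop Mapper/Reducer implementations and Input/OutputFormats
inside a Flink program. Below is a minimal sketch along the lines of the
hadoop-compatibility blog post; it assumes the flink-hadoop-compatibility
module is on the classpath, and the package names are from the Flink 0.8/0.9
era and may differ in other versions:

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.mapred.HadoopInputFormat;
import org.apache.flink.hadoopcompatibility.mapred.HadoopMapFunction;
import org.apache.flink.hadoopcompatibility.mapred.HadoopOutputFormat;
import org.apache.flink.hadoopcompatibility.mapred.HadoopReduceCombineFunction;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class HadoopWordCountOnFlink {

  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    JobConf jobConf = new JobConf();

    // Read input with an unmodified Hadoop InputFormat.
    HadoopInputFormat<LongWritable, Text> input =
        new HadoopInputFormat<LongWritable, Text>(
            new TextInputFormat(), LongWritable.class, Text.class, jobConf);
    TextInputFormat.addInputPath(jobConf, new Path(args[0]));
    DataSet<Tuple2<LongWritable, Text>> text = env.createInput(input);

    // Wrap existing Hadoop Mapper / Reducer implementations in Flink functions.
    DataSet<Tuple2<Text, LongWritable>> counts = text
        .flatMap(new HadoopMapFunction<LongWritable, Text, Text, LongWritable>(
            new Tokenizer()))
        .groupBy(0)
        .reduceGroup(new HadoopReduceCombineFunction<Text, LongWritable, Text, LongWritable>(
            new Counter(), new Counter()));

    // Write output with an unmodified Hadoop OutputFormat.
    HadoopOutputFormat<Text, LongWritable> output =
        new HadoopOutputFormat<Text, LongWritable>(
            new TextOutputFormat<Text, LongWritable>(), jobConf);
    FileOutputFormat.setOutputPath(jobConf, new Path(args[1]));
    counts.output(output);

    env.execute("Hadoop WordCount on Flink");
  }

  // A plain Hadoop (mapred API) Mapper, usable unchanged inside Flink.
  public static final class Tokenizer
      implements Mapper<LongWritable, Text, Text, LongWritable> {
    public void map(LongWritable key, Text value,
        OutputCollector<Text, LongWritable> out, Reporter rep) throws IOException {
      for (String token : value.toString().toLowerCase().split("\\W+")) {
        if (!token.isEmpty()) {
          out.collect(new Text(token), new LongWritable(1L));
        }
      }
    }
    public void configure(JobConf conf) {}
    public void close() {}
  }

  // A plain Hadoop (mapred API) Reducer, used as both reducer and combiner.
  public static final class Counter
      implements Reducer<Text, LongWritable, Text, LongWritable> {
    public void reduce(Text key, Iterator<LongWritable> values,
        OutputCollector<Text, LongWritable> out, Reporter rep) throws IOException {
      long sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      out.collect(key, new LongWritable(sum));
    }
    public void configure(JobConf conf) {}
    public void close() {}
  }
}
```

What is still missing for running an arbitrary JobConf as-is is exactly what I
mentioned above: counters and custom grouping/sorting for Combiners.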

2015-04-10 11:55 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:

> Hi guys,
> I have a nice question about Hadoop compatibility.
> In https://flink.apache.org/news/2014/11/18/hadoop-compatibility.html you
> say that existing MapReduce programs can be reused.
> Would it also be possible to handle complex MapReduce programs like HBase
> BulkImport, which use, for example, a custom partitioner
> (org.apache.hadoop.mapreduce.Partitioner)?
> In the bulk-import examples, the call to
> HFileOutputFormat2.configureIncrementalLoadMap sets a series of job
> parameters (like the partitioner, mapper, reducers, etc.) ->
> http://pastebin.com/8VXjYAEf.
> The full code can be seen at
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
> .
> Do you think there's any chance to make it run in Flink?
> Best,
> Flavio
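
Regarding the custom Partitioner specifically: one conceivable bridge, which
I have not tested, is an adapter from a Hadoop Partitioner to Flink's own
Partitioner interface, to be used with the DataSet's partitionCustom method.
HadoopPartitionerAdapter is a hypothetical name, and the approach only works
when the Hadoop partitioner decides on the key alone (true for hash- and
range-based partitioners, which is the HBase bulk-load case):

```java
import org.apache.flink.api.common.functions.Partitioner;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

// Hypothetical adapter: delegates Flink's key-based custom partitioning
// to an existing Hadoop Partitioner that ignores the record value.
public class HadoopPartitionerAdapter implements Partitioner<Text> {

  // Hadoop partitioners are not necessarily Serializable, so create the
  // delegate lazily on the worker instead of shipping an instance.
  private transient org.apache.hadoop.mapreduce.Partitioner<Text, Object> delegate;

  @Override
  public int partition(Text key, int numPartitions) {
    if (delegate == null) {
      delegate = new HashPartitioner<Text, Object>();
    }
    // Assumption: the delegate does not look at the value, so null is safe.
    return delegate.getPartition(key, null, numPartitions);
  }
}
```

Usage would be something like
dataSet.partitionCustom(new HadoopPartitionerAdapter(), 0) on a DataSet keyed
by Text in field 0. The mapper/reducer settings from
configureIncrementalLoadMap would still have to be translated by hand.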
