flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: HBase TableOutputFormat
Date Wed, 25 Mar 2015 10:42:34 GMT
Hi Flavio,

1) The parameters you set on the configuration object in the main method
should be available on the JobManager (JM) and the TaskManagers (TMs). The
OutputFormat object is serialized on the client side, sent to the JM and the
TMs, and deserialized there. Therefore, everything that was set in main()
should still be present on the JM and the TMs.
2) configure() is always called before the OutputFormat is used (on the JM
and the TMs). This would be the right place to put the code that configures
the wrapped Hadoop OF.
3) I think this is only necessary for file-based wrapped Hadoop OFs. The
HBase format should not require it.
4) This should be fixed in both 0.9 and 0.8.2, IMO.
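
To illustrate points 1) and 2), here is a minimal plain-Java sketch of the
lifecycle, not the actual Flink or Hadoop code. All class names, the ship()
helper, and the key "example.output.table" are made up for illustration. It
assumes the wrapped Hadoop object is not serializable, so the wrapper keeps
the settings in a serializable field and rebuilds the wrapped object in
configure() after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class OutputFormatLifecycleSketch {

    /** Hypothetical stand-in for a non-serializable wrapped Hadoop OutputFormat. */
    static class WrappedHadoopFormat {
        private final Map<String, String> conf = new HashMap<>();

        void setConf(Map<String, String> settings) {
            conf.putAll(settings);
        }

        String get(String key) {
            return conf.get(key);
        }
    }

    /** Hypothetical stand-in for the Flink-side wrapper that gets shipped. */
    static class WrapperOutputFormat implements Serializable {
        private static final long serialVersionUID = 1L;

        // Serializable settings: these travel with the object.
        private final HashMap<String, String> settings = new HashMap<>();

        // Not serialized: null after deserialization until configure() runs.
        private transient WrappedHadoopFormat wrapped;

        void set(String key, String value) {
            settings.put(key, value);
        }

        /** Rebuilds the wrapped format from the shipped settings. */
        void configure() {
            wrapped = new WrappedHadoopFormat();
            wrapped.setConf(settings);
        }

        String wrappedValue(String key) {
            return wrapped == null ? null : wrapped.get(key);
        }
    }

    /** Simulates shipping the format from the client to a JM or TM. */
    static WrapperOutputFormat ship(WrapperOutputFormat format) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(format);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (WrapperOutputFormat) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Client side: set parameters in main().
        WrapperOutputFormat client = new WrapperOutputFormat();
        client.set("example.output.table", "myTable");

        // "Remote" side: the deserialized copy still carries the settings,
        // but the wrapped object only exists once configure() has been called.
        WrapperOutputFormat remote = ship(client);
        System.out.println(remote.wrappedValue("example.output.table")); // null
        remote.configure();
        System.out.println(remote.wrappedValue("example.output.table")); // myTable
    }
}
```

If the settings were instead held only in a transient or non-serializable
field, they would be gone after the round trip, which is one way parameters
set in main() can appear to get lost.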

Cheers, Fabian

2015-03-23 22:19 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:

> No, I haven't. There are some points that are not clear to me:
>
> 1) Why do the parameters I set in the job configuration get lost when
> they arrive at the job and task managers?
> 2) Do you think I should put the setConf call in the configure method?
> What is the lifecycle of the OutputFormat?
> 3) Is it really necessary to set mapreduce.output.dir? Is it a standard
> approach for Hadoop compatibility?
> 4) Is it better to make a PR for Flink 0.9 or 0.8.2? When are they going
> to be released (more or less)?
>
> Best,
> Flavio
>
