flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: Thrift object serialization
Date Tue, 16 May 2017 07:26:07 GMT
Hi Gordon,
thanks for the link. Will using TBaseSerializer, compared to plain Kryo, lead to a
performance gain?
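
For reference, the registration described on the linked page can be sketched
roughly as below. This is a fragment, not a complete job: it assumes the
chill-thrift and libthrift dependencies are on the classpath, and that
MyThriftObj is the Thrift-generated class from the original question. Whether
it is actually faster than the default Kryo path is something to measure, not
something this snippet shows.

```java
import com.twitter.chill.thrift.TBaseSerializer;
import org.apache.flink.api.java.ExecutionEnvironment;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// Ask Kryo to use Thrift's own serialization for MyThriftObj instead of
// its default field-by-field serializer.
env.getConfig().addDefaultKryoSerializer(MyThriftObj.class, TBaseSerializer.class);
```

As far as I understand, the type is still handled through Kryo, so it will
most likely still print as GenericType<MyThriftObj>; only the serializer Kryo
delegates to changes.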

On Tue, May 16, 2017 at 7:32 AM, Tzu-Li (Gordon) Tai <tzulitai@apache.org>
wrote:

> Hi Flavio!
>
> I believe [1] has what you are looking for. Have you taken a look at that?
>
> Cheers,
> Gordon
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/custom_serializers.html
>
> On 15 May 2017 at 9:08:33 PM, Flavio Pompermaier (pompermaier@okkam.it)
> wrote:
>
> Hi to all,
> in my Flink job I create a DataSet<MyThriftObj> using HadoopInputFormat in
> this way:
>
> HadoopInputFormat<Void, MyThriftObj> inputFormat = new HadoopInputFormat<>(
>         new ParquetThriftInputFormat<MyThriftObj>(), Void.class, MyThriftObj.class, job);
> FileInputFormat.addInputPath(job, new org.apache.hadoop.fs.Path(inputPath));
> *DataSet<Tuple2<Void, MyThriftObj>> ds* = env.createInput(inputFormat);
>
> Flink logs this message:
>
>    - TypeExtractor - *class MyThriftObj contains custom serialization
>    methods we do not call.*
>
>
> Indeed MyThriftObj has readObject/writeObject functions and when I print
> the type of ds I see:
>
>    - Java Tuple2<Void, *GenericType<MyThriftObj>*>
>
> From my experience GenericType is a performance killer... What should I do to
> improve the reading/writing of MyThriftObj?
>
> Best,
> Flavio
>
>
> --
> Flavio Pompermaier
> Development Department
>
> OKKAM S.r.l.
> Tel. +(39) 0461 1823908
>
>


-- 
Flavio Pompermaier
Development Department

OKKAM S.r.l.
Tel. +(39) 0461 1823908
