flink-user mailing list archives

From Michael Huelfenhaus <m.huelfenh...@davengo.com>
Subject Re: Udf Performance and Object Creation
Date Thu, 13 Aug 2015 08:11:15 GMT
Hey Timo,

Yes, that is what I needed to know.

Thanks
- Michael

On 12.08.2015 at 12:44, Timo Walther <twalthr@apache.org> wrote:

> Hello Michael,
> 
> Whenever you write a Java program, you should avoid unnecessary object creation if you want it to be efficient, because every object you create has to be garbage collected later, which slows the program down.
> You can have small Pojos; just try to avoid calling "new" inside your functions:
> 
> Instead of:
> 
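> import org.apache.flink.api.common.functions.MapFunction;
>
> class Mapper implements MapFunction<String,Pojo> {
>    public Pojo map(String s) {
>       // a new Pojo is allocated for every record
>       Pojo p = new Pojo();
>       p.f = s;
>       return p;
>    }
> }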
> 
> do:
> 
> class Mapper implements MapFunction<String,Pojo> {
>    // allocated once per Mapper instance and reused for every record
>    private Pojo p = new Pojo();
>    public Pojo map(String s) {
>       p.f = s;
>       return p;
>    }
> }
> 
> Then an object is only created once per Mapper and not per record.
> 
> Hope this helps.
> 
> Regards,
> Timo
> 
> 
> 
> On 12.08.2015 11:53, Michael Huelfenhaus wrote:
>> Hello
>> 
>> I have a question about programming user-defined functions: is it still the case, as in the old Stratosphere times, that object creation should be avoided at all costs? I ask because some of the examples now create Tuples and other objects before returning them.
>> 
>> I am going to have a streaming plan with at least 6 steps, and I am going to use Pojos. Performance-wise, is it a big improvement to define one big Pojo that can be used by all the steps, or is it better to have smaller ones, which send less data but create more objects?
>> 
>> Thanks
>> Michael
> 
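
For reference, the two Mapper variants quoted above can be compared in a small
standalone sketch. It is only an illustration under stated assumptions: the
MapFunction interface here is a hypothetical one-method stand-in so the code
runs without Flink on the classpath, and the names ObjectReuseSketch,
AllocatingMapper, ReusingMapper and distinctInstances are made up for this
example. It counts how many distinct Pojo instances each variant hands out.

```java
import java.util.ArrayList;
import java.util.List;

class ObjectReuseSketch {

    // Hypothetical stand-in for Flink's MapFunction (assumption, not the real interface).
    interface MapFunction<I, O> {
        O map(I value);
    }

    // Minimal Pojo with a single public field, as in the thread.
    static class Pojo {
        String f;
    }

    // Variant 1: allocates a fresh Pojo for every record.
    static class AllocatingMapper implements MapFunction<String, Pojo> {
        public Pojo map(String s) {
            Pojo p = new Pojo();
            p.f = s;
            return p;
        }
    }

    // Variant 2: allocates one Pojo per mapper and overwrites it for every record.
    static class ReusingMapper implements MapFunction<String, Pojo> {
        private final Pojo p = new Pojo();

        public Pojo map(String s) {
            p.f = s;
            return p;
        }
    }

    // Counts the distinct Pojo instances (by reference) a mapper returns for the records.
    static int distinctInstances(MapFunction<String, Pojo> mapper, String... records) {
        List<Pojo> seen = new ArrayList<>();
        for (String r : records) {
            Pojo out = mapper.map(r);
            boolean known = false;
            for (Pojo q : seen) {
                if (q == out) {
                    known = true;
                    break;
                }
            }
            if (!known) {
                seen.add(out);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(new AllocatingMapper(), "a", "b", "c")); // prints 3
        System.out.println(distinctInstances(new ReusingMapper(), "a", "b", "c"));    // prints 1
    }
}
```

One consequence of the reusing variant in plain Java: the returned instance is
overwritten by the next call to map, so a caller that wants to keep a value has
to read or copy it before mapping the next record.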

