flink-dev mailing list archives

From Gábor Horváth <xazax....@gmail.com>
Subject Re: Tuple performance and the curious JIT compiler
Date Tue, 08 Mar 2016 10:22:34 GMT
Hi!

I am planning to do GSoC and would like to work on the serializers; more
specifically, I would like to implement code generation. I plan to send the
first draft of the proposal to the mailing list early next week. If
everything goes well, it will include some preliminary benchmarks of how
much performance gain can be expected from hand-written serializers.
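As a rough illustration of what such code generation could buy, here is a
self-contained sketch (hypothetical methods, not Flink's actual
TypeSerializer API) contrasting a generic per-field-dispatching path with
the straight-line code a generator could emit for a pair of longs:

```java
import java.io.*;

// Hypothetical sketch, not Flink code: a generic, boxing serializer path
// versus a hand-specialized one of the kind a code generator could emit.
public class SerializerSketch {

    // Generic path: boxes fields and dispatches on the runtime type of
    // every field, which defeats inlining at the write call site.
    static void serializeGeneric(Object[] fields, DataOutput out) throws IOException {
        for (Object f : fields) {
            if (f instanceof Long) {
                out.writeLong((Long) f);
            } else if (f instanceof Integer) {
                out.writeInt((Integer) f);
            } else {
                throw new IOException("unsupported field type");
            }
        }
    }

    // Specialized path: no boxing, no per-field dispatch; the JIT can
    // compile this down to a straight sequence of writes.
    static void serializeLongPair(long f0, long f1, DataOutput out) throws IOException {
        out.writeLong(f0);
        out.writeLong(f1);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream a = new ByteArrayOutputStream();
        serializeGeneric(new Object[]{42L, 7L}, new DataOutputStream(a));

        ByteArrayOutputStream b = new ByteArrayOutputStream();
        serializeLongPair(42L, 7L, new DataOutputStream(b));

        // Both paths produce identical bytes; only the call pattern differs.
        System.out.println(java.util.Arrays.equals(a.toByteArray(), b.toByteArray()));
    }
}
```

The benchmark idea would be to time the two paths over many records; the
bytes written are the same, so only the dispatch cost differs.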

Best regards,
Gábor

On 8 March 2016 at 10:47, Stephan Ewen <sewen@apache.org> wrote:

> Ah, very good, that makes sense!
>
> I would guess that this performance difference could be seen at various
> points where generic serializers and comparators are used (also for
> Comparable and Writable), or where the TupleSerializer delegates to a
> sequence of other TypeSerializers.
>
> I guess creating more specialized serializers would solve some of these
> problems, like in your IntValue vs LongValue case.
>
> The best way to solve that would probably be through code generation in the
> serializers. That has actually been my wish for quite a while.
> If you are also into these kinds of low-level performance topics, we could
> start a discussion on that.
>
> Greetings,
> Stephan
>
>
> On Mon, Mar 7, 2016 at 11:25 PM, Greg Hogan <code@greghogan.com> wrote:
>
> > The issue is not with the Tuple hierarchy (running Gelly examples had no
> > effect on runtime, and as you note there aren't any subclass overrides)
> > but with CopyableValue. I had been using IntValue exclusively but had
> > switched to using LongValue for graph generation. CopyableValueComparator
> > and CopyableValueSerializer are now working with multiple types.
> >
> > If I create IntValue- and LongValue-specific versions of
> > CopyableValueComparator and CopyableValueSerializer, and modify
> > ValueTypeInfo to return these, then I see the expected performance.
> >
> > Greg
> >
> > On Mon, Mar 7, 2016 at 5:18 AM, Stephan Ewen <sewen@apache.org> wrote:
> >
> > > Hi Greg!
> > >
> > > Sounds very interesting.
> > >
> > > Do you have a hunch which "virtual" Tuple methods are being used that
> > > become less JIT-able? In many cases, tuples use only field accesses
> > > (like "value.f1") in the user functions.
> > >
> > > I have to dig into the serializers to see if they could suffer from
> > > that. The "getField(pos)" method, for example, should always have many
> > > overrides (though few would be loaded at any one time, because one
> > > usually does not use all Tuple classes at the same time).
> > >
> > > Greetings,
> > > Stephan
> > >
> > >
> > > On Fri, Mar 4, 2016 at 11:37 PM, Greg Hogan <code@greghogan.com> wrote:
> > >
> > > > I am noticing what looks like the same drop-off in performance when
> > > > introducing TupleN subclasses as described in "Understanding the JIT
> > > > and tuning the implementation" [1].
> > > >
> > > > I start my single-node cluster, run an algorithm which relies purely
> > > > on Tuples, and measure the runtime. I then execute a separate jar
> > > > which runs essentially the same algorithm but using Gelly's Edge
> > > > (which subclasses Tuple3 but does not add any extra fields), and now
> > > > both the Tuple and Edge algorithms take twice as long.
> > > >
> > > > Has this been previously discussed? If not, I can work up a
> > > > demonstration.
> > > >
> > > > [1] https://flink.apache.org/news/2015/09/16/off-heap-memory.html
> > > >
> > > > Greg
> > > >
> > >
> >
>
