flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gyula Fóra <gyf...@apache.org>
Subject Re: Question about Flink Streaming
Date Tue, 24 Mar 2015 16:38:01 GMT
Hey Matthias,

Let's see if I get these things for you :)

1) The difference between setup and open is that, setup to set things like
collectors, runtimecontext and everything that will be used by the
implemented invokable, and also by the rich functions. Open is called after
setup, to actually open the execution of the UDF operator.

2) Close is always called, even when the task is cancelled. In addition
when a task is failing (maybe because other tasks are failing) the cancel
method is called and the main thread is interrupted. The point of having a
cancel method that some invokables might require different shutdown logic
in case of failure.

3) This I need to look into...

4) You are right about the unintuitive name here, if you could open a JIRA
for this I would appreciate that :)

5) You are absolutely right on this point, we need to spend more effort on
writing proper docs.

I hope I could clarify some stuff.


On Tue, Mar 24, 2015 at 5:05 PM, Matthias J. Sax <
mjsax@informatik.hu-berlin.de> wrote:

> Hi,
> as I get more familiar with Flink streaming and do some coding, I hit a
> few points which I want do discuss about because I find them
> contra-intuitive. Please tell me, what you think about it or clarify
> what I misunderstood.
> 1) In class StreamInvokable has two methods .setup(...) and .open(...)
>    -> what is the difference between both? When is each of both called
> exactly? It seems to be, that both are used to setup an operator. Why
> can't they be unified?
> 2) The same question about .close() and .cancel() ?
> 3) There is an class/interface hierarchy for user defined functions. The
> top level interface is 'Function' and there is an interface
> 'RichFunction' and abstract class 'AbstractRichFunction'. For each
> different type, there are user functions derived from. So far so good.
>         However, the StreamInvokable class only takes a constructor
> argument
> Function, indicating that RichFunctions are not supported. Internally,
> the given function is tested to be a RichFunction (using instanceof) at
> certain places. This in contra-intuitive from a API point of view.
>         From my OO understanding it would be better to replace Function by
> RichFunction everywhere. However, I was told that the (empty) Function
> interface is necessary for lambda expressions. Thus, I would suggest to
> extend the API with methods taking a RichFunction a parameter so it is
> clear that those are supported, too.
> 4) There is the interface Timestamp that is used to extract a time stamp
> for a record on order to create windows on a record attribute. I think
> the name "Timestamp" is miss leading, because the class does not
> represent a time stamp. I would rather call the interface
> "TimestampExtractor" or something similar.
> 5) Stefan started the discussion about more tests for the streaming
> component. I would additionally suggest to improve the Javadoc
> documentation. The are many classes an method with missing or very brief
> documentation and it is ofter hard to guess what they are used for. I
> would also suggest to describe the interaction of components/classes and
> WHY some thing are implemented in a certain way. As I have background
> knowledge from Stratosphere, I personally can work around it and make
> sense out of it (at least most times). However, for new contributers it
> might be very hard to make sense out of it and to get started
> implementing new features.
> Cheers,
>   Matthias

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message