flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ufuk Celebi <...@apache.org>
Subject Re: Replacing JobManager with Scala implementation
Date Wed, 03 Sep 2014 12:06:19 GMT
Hey Till,

I'm not sure what the "right" ASF process is, but I wouldn't mind a vote on
this in order to make sure that you don't do unnecessary work by replacing
the code with Scala.

I for one would be certainly open to it. The only thing that bothers me is
the current state of out-of-the-box IDE support. But since there are other
successful Scala projects around ;-), which manage to do it, why shouldn't
we?

@Henry, regarding Akka: I think the main motiviation for moving to Akka
(besides the points raised by Stephan and others) is that we actually don't
want to bother with low-level thread management, protocols, etc.



On Tue, Sep 2, 2014 at 8:32 PM, Henry Saputra <henry.saputra@gmail.com>
wrote:

> HI Till,
>
> Thanks for opening the discussions and lead the effort and apologize
> for late response.
>
> From what I have gathered so far, there are 2 issues:
> 1. Introducing Akka as RPC
> 2. Moving to Scala to enable easy access to Akka Scala APIs.
>
> For no1, if the RPC us used for lower level communications then we
> could probably consider Netty as the transport and serialization
> protocol (I also have added comment to the JIRA).
> Internally, to reduce thread management we could use Akka via Scala
> bridge service to make sure we use Scala Akka APIs.
>
> So addressing no 2, we could mix both Scala and Java in JobManager and
> TaskManager. the code that handle async RPC communications between JM
> and TM are using Java via Netty, and internal multi-threads or higher
> level plane code such as heart beat we could use Akka.
>
> It does introduce a bit mix between Java and Scala code but we already
> have mix of Scala and Java to support APIs so I think we could move
> some the internal code to use Scala too as "learning" steps to utilize
> Scala for better multi concurrency/ functional programming.
>
> - Henry
>
>
>
> On Sun, Aug 31, 2014 at 4:31 AM, Till Rohrmann <till.rohrmann@gmail.com>
> wrote:
> > Hi Daniel,
> >
> > the RPC rework is discussed in
> > https://issues.apache.org/jira/browse/FLINK-1019. Jira is currently down
> > due to maintenance reasons.
> >
> > The ideas to use akka are the following. Akka allows us to reduce the
> code
> > base which has to be maintained. Especially, we get rid of all the
> > multi-threading programming of the rpc service which is always hard to
> work
> > with. With Akka we would get the heartbeat signal for free, because Akka
> > can detect dead actors. Akka uses supervision to handle fault tolerance
> as
> > well as recovery and it allows an easy forwarding of remote exceptions.
> At
> > the same time it offers a nice rpc abstraction which easily allows to
> > implement asynchronous services. Furthermore, it scales rather well to
> > large numbers of nodes and hopefully we get the latencies of Flink a
> little
> > bit down.
> >
> > Bests,
> >
> > Till
> >
> >
> > On Sun, Aug 31, 2014 at 11:35 AM, Daniel Warneke <warneke@apache.org>
> wrote:
> >
> >> Hi,
> >>
> >> will akka just be used for RPC or are there any plans to expand the
> >> actor-based model to further parts of the runtime system? If so, could
> you
> >> please point me to the discussion thread?
> >>
> >> Spontaneously, I would say that adding a hard dependency on Scala just
> for
> >> the sake of having a hip RPC service sounds like a pretty dodgy deal.
> >> Therefore, I would like understand how much value akka could bring to
> Flink
> >> in the long run. The discussion whether to reimplement core components
> of
> >> the system in Scala should be the second step in my opinion.
> >>
> >> Bests,
> >>
> >>     Daniel
> >>
> >>
> >> Am 29.08.2014 11:33, schrieb Asterios Katsifodimos:
> >>
> >>  I agree that using Akka's actors from Java results in very ugly code.
> >>> Hiding the internals of Akka behind Java reflection looks better but
> >>> breaks
> >>> the principles of actors. For me it is kind of a deal breaker for using
> >>> Akka from Java.  I think that Till has more reasons to believe that
> Scala
> >>> would be a more appropriate for building a new Job/Task Manager.
> >>>
> >>> I think that this discussion should focus on 4 main aspects:
> >>> 1. Performance
> >>> 2. Implementability
> >>> 3. Maintainability
> >>> 4. Available Tools
> >>>
> >>> 1. Performance: Since that the job of the JobManager and the
> TaskManager
> >>> is
> >>> to 1) exchange messages in order to maintain a distributed state
> machine
> >>> and 2) setup connections between task managers, 3) detect failures
> etc..
> >>> In
> >>> these basic operations, performance should not be an issue. Akka was
> >>> proven
> >>> to scale quite well with very low latency. I guess that the low level
> >>> "plumbing" (serialization, connections, etc.) will continue in Java in
> >>> order to guarantee high performance. I have no clue on what's happening
> >>> with memory management and whether this will be implemented in Java or
> >>> Scala and the respective consequences. Please comment.
> >>>
> >>> 2. Since the Job/Task Manager is going to be essentially implemented
> from
> >>> scratch, given the power of Akka, it seems to me that the
> implementation
> >>> will be   easier, shorter and less verbose in Scala, given that Till is
> >>> comfortable enough with Scala.
> >>>
> >>> 3. Given #2, maintaining the code and trying out new ideas in Scala
> would
> >>> take less time and effort. But maintaining low level plumbing in Java
> and
> >>> high level logic in Scala scares me. Anyone that has done this before
> >>> could
> >>> comment on this?
> >>>
> >>> 4. Tools: Robert has raised some issues already but I think that tools
> >>> will
> >>> get better with time.
> >>>
> >>> Given the above, I would focus on #3 to be honest. Apart from this,
> going
> >>> the Scala way sounds like a great idea. I really second Kostas' opinion
> >>> that if large changes are going to happen, this is the best moment.
> >>>
> >>> Cheers,
> >>> Asterios
> >>>
> >>>
> >>>
> >>> On Fri, Aug 29, 2014 at 1:02 AM, Till Rohrmann <
> till.rohrmann@gmail.com>
> >>> wrote:
> >>>
> >>>  I also agree with Robert and Kostas that it has to be a community
> >>>> decision.
> >>>> I understand the problems with Eclipse and the Scala IDE which is a
> pain
> >>>> in
> >>>> the ass. But eventually these things will be fixed. Maybe we could
> also
> >>>> talk to the typesafe guy and tell him that this problem bothers us a
> lot.
> >>>>
> >>>> I also believe that the project is not about a specific programming
> >>>> language but a problem we want to tackle with Flink. From time to
> time it
> >>>> might be necessary to adapt the tools in order to reach the goal. In
> >>>> fact,
> >>>> I don't believe that Scala parts would drive people away from the
> >>>> project.
> >>>> Instead, Scala enthusiasts would be motivated to join us.
> >>>>
> >>>> Actually I stumbled across a quote of Leibniz which put's my point of
> >>>> view
> >>>> quite accurately in a nutshell:
> >>>>
> >>>> In symbols one observes an advantage in discovery which is greatest
> when
> >>>> they express the exact nature of a thing briefly and, as it were,
> picture
> >>>> it; then indeed the labor of thought is wonderfully diminished --
> >>>> Gottfried
> >>>> Wilhelm Leibniz
> >>>>
> >>>>
> >>>> On Thu, Aug 28, 2014 at 12:57 PM, Kostas Tzoumas <ktzoumas@apache.org
> >
> >>>> wrote:
> >>>>
> >>>>  On Thu, Aug 28, 2014 at 11:49 AM, Robert Metzger <
> rmetzger@apache.org>
> >>>>> wrote:
> >>>>>
> >>>>>  Changing the programming language of a very important system
> component
> >>>>>>
> >>>>> is
> >>>>
> >>>>> something we should carefully discuss.
> >>>>>>
> >>>>>>  Definitely agree, I think the community should discuss this
very
> >>>>>
> >>>> carefully.
> >>>>
> >>>>>
> >>>>>  I understand that Akka is written in Scala and that it will be
much
> >>>>>>
> >>>>> more
> >>>>
> >>>>> natural to implement the actor based system using Scala.
> >>>>>> I see the following issues that we should consider:
> >>>>>> Until now, Flink is clearly a project implemented only in Java.
The
> >>>>>>
> >>>>> Scala
> >>>>
> >>>>> API basically sits on top of the Java-based runtime. We do not really
> >>>>>> depend on Scala (we could easily remove the Scala API if we
want
> to).
> >>>>>> Having code written in Scala in the main system will add a hard
> >>>>>>
> >>>>> dependency
> >>>>>
> >>>>>> on a scala version.
> >>>>>> Being a pure Java project has some advantages: I think its a
fact
> that
> >>>>>> there are more Java programmers than Scala programmers. So our
> chances
> >>>>>>
> >>>>> of
> >>>>
> >>>>> attracting new contributors are higher when being a Java project.
> >>>>>> On the other hand, we could maybe attract Scala developers to
our
> >>>>>>
> >>>>> project.
> >>>>>
> >>>>>> But that has not happened (for contributors, not users!) so
far for
> our
> >>>>>> Scala API, so I don't see any reason for that to happen.
> >>>>>>
> >>>>>>
> >>>>>>  This is definitely an issue to consider. We need to carefully
> weight
> >>>>> how
> >>>>> important this issue is. If we want to break things, incubation
is
> the
> >>>>> right time to do it. Below are some arguments in favor of breaking
> >>>>>
> >>>> things,
> >>>>
> >>>>> but do keep in mind that I am undecided, and I would really like
to
> see
> >>>>>
> >>>> the
> >>>>
> >>>>> community weighing in.
> >>>>>
> >>>>> First, I would dare say that the primary reason for someone to
> >>>>> contribute
> >>>>> to Flink so far has not been that the code is written in Java, but
> more
> >>>>>
> >>>> the
> >>>>
> >>>>> content and nature of the project. Most contributors are Big Data
> >>>>> enthusiasts in some way or another.
> >>>>>
> >>>>> Second, Scala projects have attracted contributors in the past.
> >>>>>
> >>>>> Third, it should not be too hard for someone that does not know
> Scala to
> >>>>> contribute to a different component if the interfaces are clear.
> >>>>>
> >>>>>
> >>>>>  Another issue is tooling: There are a lot of problems with Scala
and
> >>>>>> Eclipse: I've recently switched to Eclipse Luna. It seems to
be
> >>>>>>
> >>>>> impossible
> >>>>>
> >>>>>> to compile Scala code with Luna because ScalaIDE does not properly
> cope
> >>>>>> with it.
> >>>>>> Even with Eclipse versions that are supported by ScalaIDE, you
have
> to
> >>>>>> manually install 3 plugins, some of them are not available in
the
> >>>>>>
> >>>>> Eclipse
> >>>>
> >>>>> Marketplace. So with a JobManager written in Scala, users can not
> just
> >>>>>> import our project as a Maven project into Eclipse and start
> >>>>>>
> >>>>> developing.
> >>>>
> >>>>> The support for Maven is probably also limited. For example, I don't
> >>>>>>
> >>>>> know
> >>>>
> >>>>> if there is a checkstyle plugin for Scala.
> >>>>>>
> >>>>>> I'm looking forward to hearing other opinions on this issue.
As I
> said
> >>>>>>
> >>>>> in
> >>>>
> >>>>> the beginning, we should exchange arguments on this and think about
> it
> >>>>>>
> >>>>> for
> >>>>>
> >>>>>> some time before we decide on this.
> >>>>>>
> >>>>>>  Best,
> >>>>>
> >>>>>> Robert
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Aug 28, 2014 at 1:08 AM, Till Rohrmann <
> trohrmann@apache.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>  Hi guys,
> >>>>>>>
> >>>>>>> I currently working on replacing the old rpc infrastructure
with an
> >>>>>>>
> >>>>>> akka
> >>>>>
> >>>>>> based actor system. In the wake of this change I will reimplement
> the
> >>>>>>> JobManager and TaskManager which will then be actors. Akka
offers a
> >>>>>>>
> >>>>>> Java
> >>>>>
> >>>>>> API but the implementation turns out to be very verbose and
> >>>>>>>
> >>>>>> laborious,
> >>>>
> >>>>> because Java 6 and 7 do not support lambdas and pattern matching.
> >>>>>>>
> >>>>>> Using
> >>>>
> >>>>> Scala instead, would allow a far more succinct and clear
> >>>>>>>
> >>>>>> implementation
> >>>>
> >>>>> of
> >>>>>>
> >>>>>>> the JobManager and TaskManager. Instead of a lot of if statements
> >>>>>>>
> >>>>>> using
> >>>>
> >>>>> instanceof to figure out the message type, we could simply use
> >>>>>>>
> >>>>>> pattern
> >>>>
> >>>>> matching. Furthermore, the callback functions could simply be Scala's
> >>>>>>> anonymous functions. Therefore I would propose to use Scala
for
> these
> >>>>>>>
> >>>>>> two
> >>>>>
> >>>>>> systems.
> >>>>>>>
> >>>>>>> The Akka system uses the slf4j library as logging interface.
> >>>>>>>
> >>>>>> Therefore
> >>>>
> >>>>> I
> >>>>>
> >>>>>> would also propose to replace the jcl logging system with the
slf4j
> >>>>>>>
> >>>>>> logging
> >>>>>>
> >>>>>>> system. Since we want to use Akka in many parts of the runtime
> system
> >>>>>>>
> >>>>>> and
> >>>>>
> >>>>>> it recommends using logback as logging backend, I would also
like to
> >>>>>>> replace log4j with logback. But this change should inflict
only few
> >>>>>>>
> >>>>>> changes
> >>>>>>
> >>>>>>> once we established the slf4j logging interface everywhere.
> >>>>>>>
> >>>>>>> What do you guys think of that idea?
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>>
> >>>>>>> Till
> >>>>>>>
> >>>>>>>
> >>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message