flink-dev mailing list archives

From Kostas Tzoumas <ktzou...@apache.org>
Subject Re: Replacing JobManager with Scala implementation
Date Wed, 03 Sep 2014 20:39:32 GMT
Should be a separate thread with [VOTE] in the subject line, a clear
description of what we are voting for, and the duration of the vote
(typically 72 hours).



On Wed, Sep 3, 2014 at 10:14 PM, Till Rohrmann <till.rohrmann@gmail.com>
wrote:

> How do we then start the vote on whether we should implement the JobManager
> with Scala or not? Can we just do it in this thread or should it happen in
> a separate thread?
>
>
> On Wed, Sep 3, 2014 at 6:27 PM, Henry Saputra <henry.saputra@gmail.com>
> wrote:
>
> > Thanks @Ufuk for the response.
> >
> > Yeah, Akka hides all the low-level nuts and bolts of the RPC flow,
> > but it also makes it a bit harder to debug issues when communication
> > fails.
> > It makes sense to use one RPC framework if we can, and since there
> > are other plans for Akka in the code to help manage concurrent
> > programming, it is a good idea to use Akka for RPC.
> >
> > - Henry
> >
> >
> > On Wed, Sep 3, 2014 at 5:06 AM, Ufuk Celebi <uce@apache.org> wrote:
> > > Hey Till,
> > >
> > > > I'm not sure what the "right" ASF process is, but I wouldn't mind a
> > > > vote on this in order to make sure that you don't do unnecessary work
> > > > by replacing the code with Scala.
> > >
> > > > I for one would be certainly open to it. The only thing that bothers
> > > > me is the current state of out-of-the-box IDE support. But since there
> > > > are other successful Scala projects around ;-), which manage to do it,
> > > > why shouldn't we?
> > >
> > > > @Henry, regarding Akka: I think the main motivation for moving to Akka
> > > > (besides the points raised by Stephan and others) is that we actually
> > > > don't want to bother with low-level thread management, protocols, etc.
> > >
> > >
> > >
> > > > On Tue, Sep 2, 2014 at 8:32 PM, Henry Saputra <henry.saputra@gmail.com>
> > > > wrote:
> > >
> > > >> Hi Till,
> > > >>
> > > >> Thanks for opening the discussion and leading the effort, and apologies
> > > >> for the late response.
> > > >>
> > > >> From what I have gathered so far, there are 2 issues:
> > > >> 1. Introducing Akka as RPC
> > > >> 2. Moving to Scala to enable easy access to Akka Scala APIs.
> > > >>
> > > >> For no. 1, if the RPC is used for lower-level communications then we
> > > >> could probably consider Netty as the transport and serialization
> > > >> protocol (I have also added a comment to the JIRA).
> > > >> Internally, to reduce thread management, we could use Akka via a Scala
> > > >> bridge service to make sure we use the Scala Akka APIs.
> > > >>
> > > >> So, addressing no. 2, we could mix both Scala and Java in the JobManager
> > > >> and TaskManager. The code that handles async RPC communication between JM
> > > >> and TM would use Java via Netty, and for internal multi-threading or
> > > >> higher-level plane code such as the heartbeat we could use Akka.
> > > >>
> > > >> It does introduce a bit of a mix between Java and Scala code, but we
> > > >> already have a mix of Scala and Java to support the APIs, so I think we
> > > >> could move some of the internal code to Scala too as "learning" steps to
> > > >> utilize Scala for better concurrency and functional programming.
> > >>
> > >> - Henry
> > >>
> > >>
> > >>
> > > >> On Sun, Aug 31, 2014 at 4:31 AM, Till Rohrmann <till.rohrmann@gmail.com>
> > > >> wrote:
> > >> > Hi Daniel,
> > >> >
> > > >> > the RPC rework is discussed in
> > > >> > https://issues.apache.org/jira/browse/FLINK-1019. Jira is currently
> > > >> > down due to maintenance reasons.
> > > >> >
> > > >> > The ideas to use akka are the following. Akka allows us to reduce the
> > > >> > code base which has to be maintained. Especially, we get rid of all the
> > > >> > multi-threading programming of the rpc service which is always hard to
> > > >> > work with. With Akka we would get the heartbeat signal for free, because
> > > >> > Akka can detect dead actors. Akka uses supervision to handle fault
> > > >> > tolerance as well as recovery and it allows an easy forwarding of remote
> > > >> > exceptions. At the same time it offers a nice rpc abstraction which
> > > >> > easily allows to implement asynchronous services. Furthermore, it scales
> > > >> > rather well to large numbers of nodes and hopefully we get the latencies
> > > >> > of Flink a little bit down.
> > >> >
> > >> > Bests,
> > >> >
> > >> > Till
> > >> >
> > >> >
> > > >> > On Sun, Aug 31, 2014 at 11:35 AM, Daniel Warneke <warneke@apache.org>
> > > >> > wrote:
> > >> >
> > >> >> Hi,
> > >> >>
> > > >> >> will akka just be used for RPC or are there any plans to expand the
> > > >> >> actor-based model to further parts of the runtime system? If so, could
> > > >> >> you please point me to the discussion thread?
> > > >> >>
> > > >> >> Spontaneously, I would say that adding a hard dependency on Scala just
> > > >> >> for the sake of having a hip RPC service sounds like a pretty dodgy
> > > >> >> deal. Therefore, I would like to understand how much value akka could
> > > >> >> bring to Flink in the long run. The discussion whether to reimplement
> > > >> >> core components of the system in Scala should be the second step in my
> > > >> >> opinion.
> > >> >>
> > >> >> Bests,
> > >> >>
> > >> >>     Daniel
> > >> >>
> > >> >>
> > > >> >> On 29.08.2014 11:33, Asterios Katsifodimos wrote:
> > >> >>
> > >> >>  I agree that using Akka's actors from Java results in very ugly
> > code.
> > >> >>> Hiding the internals of Akka behind Java reflection looks
better
> but
> > >> >>> breaks
> > >> >>> the principles of actors. For me it is kind of a deal breaker
for
> > using
> > >> >>> Akka from Java.  I think that Till has more reasons to believe
> that
> > >> Scala
> > >> >>> would be a more appropriate for building a new Job/Task Manager.
> > >> >>>
> > >> >>> I think that this discussion should focus on 4 main aspects:
> > >> >>> 1. Performance
> > >> >>> 2. Implementability
> > >> >>> 3. Maintainability
> > >> >>> 4. Available Tools
> > >> >>>
> > >> >>> 1. Performance: Since that the job of the JobManager and the
> > >> TaskManager
> > >> >>> is
> > >> >>> to 1) exchange messages in order to maintain a distributed
state
> > >> machine
> > >> >>> and 2) setup connections between task managers, 3) detect
failures
> > >> etc..
> > >> >>> In
> > >> >>> these basic operations, performance should not be an issue.
Akka
> was
> > >> >>> proven
> > >> >>> to scale quite well with very low latency. I guess that the
low
> > level
> > >> >>> "plumbing" (serialization, connections, etc.) will continue
in
> Java
> > in
> > >> >>> order to guarantee high performance. I have no clue on what's
> > happening
> > >> >>> with memory management and whether this will be implemented
in
> Java
> > or
> > >> >>> Scala and the respective consequences. Please comment.
> > >> >>>
> > >> >>> 2. Since the Job/Task Manager is going to be essentially
> implemented
> > >> from
> > >> >>> scratch, given the power of Akka, it seems to me that the
> > >> implementation
> > >> >>> will be   easier, shorter and less verbose in Scala, given
that
> > Till is
> > >> >>> comfortable enough with Scala.
> > >> >>>
> > >> >>> 3. Given #2, maintaining the code and trying out new ideas
in
> Scala
> > >> would
> > >> >>> take less time and effort. But maintaining low level plumbing
in
> > Java
> > >> and
> > >> >>> high level logic in Scala scares me. Anyone that has done
this
> > before
> > >> >>> could
> > >> >>> comment on this?
> > >> >>>
> > >> >>> 4. Tools: Robert has raised some issues already but I think
that
> > tools
> > >> >>> will
> > >> >>> get better with time.
> > >> >>>
> > >> >>> Given the above, I would focus on #3 to be honest. Apart from
> this,
> > >> going
> > >> >>> the Scala way sounds like a great idea. I really second Kostas'
> > opinion
> > >> >>> that if large changes are going to happen, this is the best
> moment.
> > >> >>>
> > >> >>> Cheers,
> > >> >>> Asterios
> > >> >>>
> > >> >>>
> > >> >>>
> > > >> >>> On Fri, Aug 29, 2014 at 1:02 AM, Till Rohrmann <till.rohrmann@gmail.com>
> > > >> >>> wrote:
> > >> >>>
> > >> >>>  I also agree with Robert and Kostas that it has to be a community
> > >> >>>> decision.
> > >> >>>> I understand the problems with Eclipse and the Scala IDE
which
> is a
> > >> pain
> > >> >>>> in
> > >> >>>> the ass. But eventually these things will be fixed. Maybe
we
> could
> > >> also
> > >> >>>> talk to the typesafe guy and tell him that this problem
bothers
> us
> > a
> > >> lot.
> > >> >>>>
> > >> >>>> I also believe that the project is not about a specific
> programming
> > >> >>>> language but a problem we want to tackle with Flink. From
time to
> > >> time it
> > >> >>>> might be necessary to adapt the tools in order to reach
the goal.
> > In
> > >> >>>> fact,
> > >> >>>> I don't believe that Scala parts would drive people away
from the
> > >> >>>> project.
> > >> >>>> Instead, Scala enthusiasts would be motivated to join
us.
> > >> >>>>
> > >> >>>> Actually I stumbled across a quote of Leibniz which put's
my
> point
> > of
> > >> >>>> view
> > >> >>>> quite accurately in a nutshell:
> > >> >>>>
> > >> >>>> In symbols one observes an advantage in discovery which
is
> greatest
> > >> when
> > >> >>>> they express the exact nature of a thing briefly and,
as it were,
> > >> picture
> > >> >>>> it; then indeed the labor of thought is wonderfully diminished
--
> > >> >>>> Gottfried
> > >> >>>> Wilhelm Leibniz
> > >> >>>>
> > >> >>>>
> > > >> >>>> On Thu, Aug 28, 2014 at 12:57 PM, Kostas Tzoumas <ktzoumas@apache.org>
> > > >> >>>> wrote:
> > >> >>>>
> > >> >>>>  On Thu, Aug 28, 2014 at 11:49 AM, Robert Metzger <
> > >> rmetzger@apache.org>
> > >> >>>>> wrote:
> > >> >>>>>
> > >> >>>>>  Changing the programming language of a very important
system
> > >> component
> > >> >>>>>>
> > >> >>>>> is
> > >> >>>>
> > >> >>>>> something we should carefully discuss.
> > >> >>>>>>
> > >> >>>>>>  Definitely agree, I think the community should
discuss this
> very
> > >> >>>>>
> > >> >>>> carefully.
> > >> >>>>
> > >> >>>>>
> > >> >>>>>  I understand that Akka is written in Scala and that
it will be
> > much
> > >> >>>>>>
> > >> >>>>> more
> > >> >>>>
> > >> >>>>> natural to implement the actor based system using
Scala.
> > >> >>>>>> I see the following issues that we should consider:
> > >> >>>>>> Until now, Flink is clearly a project implemented
only in Java.
> > The
> > >> >>>>>>
> > >> >>>>> Scala
> > >> >>>>
> > >> >>>>> API basically sits on top of the Java-based runtime.
We do not
> > really
> > >> >>>>>> depend on Scala (we could easily remove the Scala
API if we
> want
> > >> to).
> > >> >>>>>> Having code written in Scala in the main system
will add a hard
> > >> >>>>>>
> > >> >>>>> dependency
> > >> >>>>>
> > >> >>>>>> on a scala version.
> > >> >>>>>> Being a pure Java project has some advantages:
I think its a
> fact
> > >> that
> > >> >>>>>> there are more Java programmers than Scala programmers.
So our
> > >> chances
> > >> >>>>>>
> > >> >>>>> of
> > >> >>>>
> > >> >>>>> attracting new contributors are higher when being
a Java
> project.
> > >> >>>>>> On the other hand, we could maybe attract Scala
developers to
> our
> > >> >>>>>>
> > >> >>>>> project.
> > >> >>>>>
> > >> >>>>>> But that has not happened (for contributors, not
users!) so far
> > for
> > >> our
> > >> >>>>>> Scala API, so I don't see any reason for that
to happen.
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>  This is definitely an issue to consider. We need
to carefully
> > >> weight
> > >> >>>>> how
> > >> >>>>> important this issue is. If we want to break things,
incubation
> is
> > >> the
> > >> >>>>> right time to do it. Below are some arguments in favor
of
> breaking
> > >> >>>>>
> > >> >>>> things,
> > >> >>>>
> > >> >>>>> but do keep in mind that I am undecided, and I would
really like
> > to
> > >> see
> > >> >>>>>
> > >> >>>> the
> > >> >>>>
> > >> >>>>> community weighing in.
> > >> >>>>>
> > >> >>>>> First, I would dare say that the primary reason for
someone to
> > >> >>>>> contribute
> > >> >>>>> to Flink so far has not been that the code is written
in Java,
> but
> > >> more
> > >> >>>>>
> > >> >>>> the
> > >> >>>>
> > >> >>>>> content and nature of the project. Most contributors
are Big
> Data
> > >> >>>>> enthusiasts in some way or another.
> > >> >>>>>
> > >> >>>>> Second, Scala projects have attracted contributors
in the past.
> > >> >>>>>
> > >> >>>>> Third, it should not be too hard for someone that
does not know
> > >> Scala to
> > >> >>>>> contribute to a different component if the interfaces
are clear.
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>  Another issue is tooling: There are a lot of problems
with
> Scala
> > and
> > >> >>>>>> Eclipse: I've recently switched to Eclipse Luna.
It seems to be
> > >> >>>>>>
> > >> >>>>> impossible
> > >> >>>>>
> > >> >>>>>> to compile Scala code with Luna because ScalaIDE
does not
> > properly
> > >> cope
> > >> >>>>>> with it.
> > >> >>>>>> Even with Eclipse versions that are supported
by ScalaIDE, you
> > have
> > >> to
> > >> >>>>>> manually install 3 plugins, some of them are not
available in
> the
> > >> >>>>>>
> > >> >>>>> Eclipse
> > >> >>>>
> > >> >>>>> Marketplace. So with a JobManager written in Scala,
users can
> not
> > >> just
> > >> >>>>>> import our project as a Maven project into Eclipse
and start
> > >> >>>>>>
> > >> >>>>> developing.
> > >> >>>>
> > >> >>>>> The support for Maven is probably also limited. For
example, I
> > don't
> > >> >>>>>>
> > >> >>>>> know
> > >> >>>>
> > >> >>>>> if there is a checkstyle plugin for Scala.
> > >> >>>>>>
> > >> >>>>>> I'm looking forward to hearing other opinions
on this issue.
> As I
> > >> said
> > >> >>>>>>
> > >> >>>>> in
> > >> >>>>
> > >> >>>>> the beginning, we should exchange arguments on this
and think
> > about
> > >> it
> > >> >>>>>>
> > >> >>>>> for
> > >> >>>>>
> > >> >>>>>> some time before we decide on this.
> > >> >>>>>>
> > >> >>>>>>  Best,
> > >> >>>>>
> > >> >>>>>> Robert
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > > >> >>>>>> On Thu, Aug 28, 2014 at 1:08 AM, Till Rohrmann <trohrmann@apache.org>
> > > >> >>>>>> wrote:
> > >> >>>>>>
> > >> >>>>>>  Hi guys,
> > >> >>>>>>>
> > >> >>>>>>> I currently working on replacing the old rpc
infrastructure
> > with an
> > >> >>>>>>>
> > >> >>>>>> akka
> > >> >>>>>
> > >> >>>>>> based actor system. In the wake of this change
I will
> reimplement
> > >> the
> > >> >>>>>>> JobManager and TaskManager which will then
be actors. Akka
> > offers a
> > >> >>>>>>>
> > >> >>>>>> Java
> > >> >>>>>
> > >> >>>>>> API but the implementation turns out to be very
verbose and
> > >> >>>>>>>
> > >> >>>>>> laborious,
> > >> >>>>
> > >> >>>>> because Java 6 and 7 do not support lambdas and pattern
> matching.
> > >> >>>>>>>
> > >> >>>>>> Using
> > >> >>>>
> > >> >>>>> Scala instead, would allow a far more succinct and
clear
> > >> >>>>>>>
> > >> >>>>>> implementation
> > >> >>>>
> > >> >>>>> of
> > >> >>>>>>
> > >> >>>>>>> the JobManager and TaskManager. Instead of
a lot of if
> > statements
> > >> >>>>>>>
> > >> >>>>>> using
> > >> >>>>
> > >> >>>>> instanceof to figure out the message type, we could
simply use
> > >> >>>>>>>
> > >> >>>>>> pattern
> > >> >>>>
> > >> >>>>> matching. Furthermore, the callback functions could
simply be
> > Scala's
> > >> >>>>>>> anonymous functions. Therefore I would propose
to use Scala
> for
> > >> these
> > >> >>>>>>>
> > >> >>>>>> two
> > >> >>>>>
> > >> >>>>>> systems.
> > >> >>>>>>>
> > >> >>>>>>> The Akka system uses the slf4j library as
logging interface.
> > >> >>>>>>>
> > >> >>>>>> Therefore
> > >> >>>>
> > >> >>>>> I
> > >> >>>>>
> > >> >>>>>> would also propose to replace the jcl logging
system with the
> > slf4j
> > >> >>>>>>>
> > >> >>>>>> logging
> > >> >>>>>>
> > >> >>>>>>> system. Since we want to use Akka in many
parts of the runtime
> > >> system
> > >> >>>>>>>
> > >> >>>>>> and
> > >> >>>>>
> > >> >>>>>> it recommends using logback as logging backend,
I would also
> > like to
> > >> >>>>>>> replace log4j with logback. But this change
should inflict
> only
> > few
> > >> >>>>>>>
> > >> >>>>>> changes
> > >> >>>>>>
> > >> >>>>>>> once we established the slf4j logging interface
everywhere.
> > >> >>>>>>>
> > >> >>>>>>> What do you guys think of that idea?
> > >> >>>>>>>
> > >> >>>>>>> Best regards,
> > >> >>>>>>>
> > >> >>>>>>> Till
> > >> >>>>>>>
> > >> >>>>>>>
> > >> >>
> > >>
> >
>
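
The original proposal at the bottom of this thread argues that actor code
written in Java ends up as a chain of instanceof checks to figure out the
message type. A minimal sketch of that dispatch style, in plain Java 7 syntax
and without any Akka dependency, might look like the following. The message
classes (RegisterTaskManager, Heartbeat, SubmitJob) and the onReceive method
are hypothetical names chosen for illustration; they are not taken from Flink
or from Akka's API.

```java
// Illustrative only: the message types and handler below are assumptions,
// not Flink or Akka classes.
public class InstanceofDispatch {

    // Three hypothetical message types a JobManager-like actor might receive.
    static final class RegisterTaskManager {
        final String host;
        RegisterTaskManager(String host) { this.host = host; }
    }

    static final class Heartbeat {
        final String taskManagerId;
        Heartbeat(String taskManagerId) { this.taskManagerId = taskManagerId; }
    }

    static final class SubmitJob {
        final String jobName;
        SubmitJob(String jobName) { this.jobName = jobName; }
    }

    // An untyped receive method: every message arrives as Object, so each
    // branch needs an instanceof test followed by an explicit cast.
    static String onReceive(Object message) {
        if (message instanceof RegisterTaskManager) {
            RegisterTaskManager m = (RegisterTaskManager) message;
            return "registered " + m.host;
        } else if (message instanceof Heartbeat) {
            Heartbeat m = (Heartbeat) message;
            return "heartbeat from " + m.taskManagerId;
        } else if (message instanceof SubmitJob) {
            SubmitJob m = (SubmitJob) message;
            return "accepted job " + m.jobName;
        } else {
            // Anything else, including null, falls through to here.
            return "unhandled";
        }
    }

    public static void main(String[] args) {
        System.out.println(onReceive(new RegisterTaskManager("worker-1")));
        System.out.println(onReceive(new Heartbeat("tm-1")));
        System.out.println(onReceive(new SubmitJob("wordcount")));
        System.out.println(onReceive("something else"));
    }
}
```

Each new message type adds another test-and-cast branch. In Scala the same
dispatch would typically be a single match expression over case classes, which
is the succinctness the proposal refers to.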
