flink-dev mailing list archives

From Gyula Fóra <gyf...@apache.org>
Subject Re: [DISCUSS] Dedicated streaming mode
Date Thu, 21 May 2015 20:30:11 GMT
Huge +1 from my side :)

Sorry for the late response.

On Thu, May 21, 2015 at 9:54 PM, Aljoscha Krettek <aljoscha@apache.org>
wrote:

> This sounds very reasonable.
> On May 21, 2015 9:34 PM, "Stephan Ewen" <sewen@apache.org> wrote:
>
> > I discussed a bit via Skype with Gyula and Paris.
> >
> >
> > We thought about the following way to do it:
> >
> >  - We add a dedicated streaming mode for now. The streaming mode supersedes
> > the batch mode, so it can run both types of programs.
> >
> >  - The streaming mode sets the memory manager to "lazy allocation".
> >     -> So long as it runs pure streaming jobs, the full heap will be
> > available to window buffers and UDFs.
> >     -> Batch programs can still run, so mixed workloads are not prevented.
> > Batch programs are a bit less robust there, because the memory manager does
> > not pre-allocate memory. UDFs can eat into Flink's memory portion.
> >
> >  - The streaming mode starts the necessary configured components/services
> > for state backups
> >
> >
> >
> > Over the next versions, we want to bring these things together:
> >   - use the managed memory for window buffers
> >   - on-demand starting of the state backend
> >
> > Then, we deprecate the streaming mode and let both modes start the cluster in
> > the same way.
> >
> >
> >
> >
> >
> > On Thu, May 21, 2015 at 4:01 PM, Aljoscha Krettek <aljoscha@apache.org>
> > wrote:
> >
> > > Would it not be possible to start the snapshot service once the user
> > > starts the first streaming job? About 2): with checkpointing coming up,
> > > would it not make sense to shift to managed memory sooner rather than
> > > later? Then this point would become moot.
> > >
> > > On Thu, May 21, 2015 at 3:47 PM, Matthias J. Sax
> > > <mjsax@informatik.hu-berlin.de> wrote:
> > > > What would be the consequences for "mixed" programs (if there is any
> > > > plan to support those)?
> > > >
> > > > Would it be necessary to have a third mode? Or would those programs
> > > > simply run in streaming mode?
> > > >
> > > > -Matthias
> > > >
> > > > On 05/21/2015 03:12 PM, Stephan Ewen wrote:
> > > >> Hi all!
> > > >>
> > > >> We discussed a while back introducing a dedicated streaming mode for
> > > >> Flink. I would like to have a go at this and implement the changes, but
> > > >> discuss them first.
> > > >>
> > > >>
> > > >> Here is a brief summary of why we wanted to introduce the dedicated
> > > >> streaming mode:
> > > >> Even though both batch and streaming are executed by the same execution
> > > >> engine, a streaming setup of Flink varies a bit from a batch setup:
> > > >>
> > > >> 1) The streaming cluster starts an additional service to store the
> > > >> distributed state snapshots.
> > > >>
> > > >> 2) Streaming mode uses memory a bit differently, so we should configure the
> > > >> memory manager differently. This difference may eventually go away.
> > > >>
> > > >>
> > > >>
> > > >> Concretely, to implement this, I was thinking about introducing the
> > > >> following externally visible changes:
> > > >>
> > > >>  - Additional scripts "start-streaming-cluster.sh" and
> > > >> "start-streaming-local.sh"
> > > >>
> > > >>  - An execution mode parameter for the TaskManager ("batch / streaming")
> > > >>
> > > >>  - An execution mode parameter for the JobManager ("batch / streaming")
> > > >>
> > > >>  - All local executors and mini clusters need a flag that specifies whether
> > > >> they will start a streaming cluster or a pure batch cluster.
> > > >>
> > > >>
> > > >> Anything else that comes to your minds?
> > > >>
> > > >>
> > > >> Greetings,
> > > >> Stephan
> > > >>
> > > >
> > >
> >
>
