flink-dev mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: Improve the documentation of the Flink Architecture and internals
Date Sat, 21 Mar 2015 16:30:13 GMT
I wrote some internal documentation for Akka and the distributed
communication [1].

Cheers,

Till

[1] https://cwiki.apache.org/confluence/display/FLINK/Akka+and+Actors

On Fri, Mar 20, 2015 at 7:31 PM, Henry Saputra <henry.saputra@gmail.com>
wrote:

> Ah, the Twitter infra bot just announced extended downtime for Confluence [1]
>
> - Henry
>
> [1] https://twitter.com/infrabot/status/578983473970475008
>
> On Fri, Mar 20, 2015 at 11:27 AM, Stephan Ewen <sewen@apache.org> wrote:
> > For me as well. Earlier today it said "down for maintenance"
> >
> > On Fri, Mar 20, 2015 at 7:14 PM, Kostas Tzoumas <ktzoumas@apache.org>
> > wrote:
> >
> >> it's down for me as well
> >>
> >> On Fri, Mar 20, 2015 at 7:12 PM, Henry Saputra <henry.saputra@gmail.com>
> >> wrote:
> >>
> >> > Is the wiki down for any of you?
> >> >
> >> > I can't access
> >> > https://cwiki.apache.org/confluence/display/FLINK/Apache+Flink+Home
> >> >
> >> > 404
> >> >
> >> > - Henry
> >> >
> >> > On Fri, Mar 20, 2015 at 4:46 AM, Kostas Tzoumas <ktzoumas@apache.org>
> >> > wrote:
> >> > > I added a document for data exchange between tasks:
> >> > >
> >> > > https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
> >> > >
> >> > > Feel free to edit. I plan to link the class names to the class files in
> >> > > github.
> >> > >
> >> > > On Tue, Mar 17, 2015 at 11:17 AM, Kostas Tzoumas <ktzoumas@apache.org>
> >> > > wrote:
> >> > >
> >> > >> +1 for the Wiki.
> >> > >>
> >> > >> When these have been stabilized we can move them to the docs if we
> >> > >> decide to do so.
> >> > >>
> >> > >> On Mon, Mar 16, 2015 at 10:07 PM, Stephan Ewen <sewen@apache.org>
> >> > >> wrote:
> >> > >>
> >> > >>> I have put my suggested version of an outline for the docs into the
> >> > >>> wiki. Regardless of where the docs end up (wiki or repository), we can
> >> > >>> use the wiki to outline the docs.
> >> > >>>
> >> > >>> https://cwiki.apache.org/confluence/display/FLINK/Flink+Internals
> >> > >>>
> >> > >>> Some pages contain some stub or outline, others are completely blank.
> >> > >>>
> >> > >>> Not a complete list. Additions are welcome.
> >> > >>>
> >> > >>> On Mon, Mar 16, 2015 at 10:04 PM, Stephan Ewen <sewen@apache.org>
> >> > >>> wrote:
> >> > >>>
> >> > >>> > I think the Wiki has a much lower barrier to entry for fixing docs,
> >> > >>> > especially for external people. The docs, with the Jekyll setup, are
> >> > >>> > rather tricky. I would very much like all kinds of people to
> >> > >>> > contribute to the docs about the internals, not just the usual three
> >> > >>> > suspects that have done this so far.
> >> > >>> >
> >> > >>> > Having a good landing page in the regular docs is exactly to not lose
> >> > >>> > all the people that do not look into a wiki. The overview pages for
> >> > >>> > the internals need to be good and accessible and nicely link to the
> >> > >>> > wiki to "forward" people there.
> >> > >>> >
> >> > >>> > The overhead of deciding what goes where should not be terribly
> >> > >>> > large, in my opinion, since there is no really "wrong" place to put
> >> > >>> > it.
> >> > >>> >
> >> > >>> >
> >> > >>> >
> >> > >>> > On Mon, Mar 16, 2015 at 9:58 PM, Aljoscha Krettek <aljoscha@apache.org>
> >> > >>> > wrote:
> >> > >>> >
> >> > >>> >> Why do you want to split stuff between the docs in the repository
> >> > >>> >> and the wiki? I for one would always be too lazy to check stuff in a
> >> > >>> >> wiki when there is also documentation. Plus, this would lead to
> >> > >>> >> additional overhead in deciding what goes where and syncing between
> >> > >>> >> the two places for documentation.
> >> > >>> >>
> >> > >>> >> On Mon, Mar 16, 2015 at 7:59 PM, Stephan Ewen <sewen@apache.org>
> >> > >>> >> wrote:
> >> > >>> >> > Ah, I totally forgot to add to the internals:
> >> > >>> >> >
> >> > >>> >> >   - Fault tolerance in Batch mode
> >> > >>> >> >
> >> > >>> >> >   - Fault Tolerance in Streaming Mode, with state handling
> >> > >>> >> >
> >> > >>> >> > On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen <sewen@apache.org>
> >> > >>> >> > wrote:
> >> > >>> >> >
> >> > >>> >> >> Hi all!
> >> > >>> >> >>
> >> > >>> >> >> I would like to kick off an effort to improve the documentation
> >> > >>> >> >> of the Flink Architecture and internals. This also means making
> >> > >>> >> >> the streaming architecture more prominent in the docs.
> >> > >>> >> >>
> >> > >>> >> >> Since Flink is quite a sophisticated stack, we need to improve
> >> > >>> >> >> the presentation of how Flink works - to the extent necessary to
> >> > >>> >> >> use Flink (and to appreciate all the cool stuff that is
> >> > >>> >> >> happening). This should also come in handy for new contributors.
> >> > >>> >> >>
> >> > >>> >> >> As a general umbrella, we need to first decide where and how to
> >> > >>> >> >> organize the documentation.
> >> > >>> >> >>
> >> > >>> >> >> I would propose to put the bulk of the documentation into the
> >> > >>> >> >> Wiki. Create a dedicated section on Flink Internals and sub-pages
> >> > >>> >> >> for each component / topic. To the docs, we add a general
> >> > >>> >> >> overview from which we link into the Wiki.
> >> > >>> >> >>
> >> > >>> >> >>
> >> > >>> >> >>  == These sections would go into the DOCS in the git repository ==
> >> > >>> >> >>
> >> > >>> >> >>   - Overview of Program, pre-flight phase (type extraction,
> >> > >>> >> >> optimizer), JobManager, TaskManager. Differences between
> >> > >>> >> >> streaming and batch. We can realize this through one very nice
> >> > >>> >> >> picture with few lines of text.
> >> > >>> >> >>
> >> > >>> >> >>   - High level architecture stack, different program
> >> > >>> >> >> representations (API operators, common API DAG, optimizer DAG,
> >> > >>> >> >> parallel data flow (JobGraph / Execution Graph))
> >> > >>> >> >>
> >> > >>> >> >>   - (maybe) Parallelism and scheduling. This seems to be paramount
> >> > >>> >> >> for users to understand.
> >> > >>> >> >>
> >> > >>> >> >>   - Processes (JobManager, TaskManager, Webserver, WebClient, CLI
> >> > >>> >> >> client)
> >> > >>> >> >>
> >> > >>> >> >>
> >> > >>> >> >>
> >> > >>> >> >>  == These sections would go into the WIKI ==
> >> > >>> >> >>
> >> > >>> >> >>   - Project structure (maven projects, what is where, dependencies
> >> > >>> >> >> between projects)
> >> > >>> >> >>
> >> > >>> >> >>   - Component overview
> >> > >>> >> >>
> >> > >>> >> >>     -> JobManager (InstanceManager, Scheduler, BLOB server,
> >> > >>> >> >> Library Cache, Archiving)
> >> > >>> >> >>
> >> > >>> >> >>     -> TaskManager (MemoryManager, IOManager, BLOB Cache, Library
> >> > >>> >> >> Cache)
> >> > >>> >> >>
> >> > >>> >> >>     -> Involved Actor Systems / Actors / Messages
> >> > >>> >> >>
> >> > >>> >> >>   - Details about submitting a job (library upload, job graph
> >> > >>> >> >> submission, execution graph setup, scheduling trigger)
> >> > >>> >> >>
> >> > >>> >> >>   - Memory Management
> >> > >>> >> >>
> >> > >>> >> >>   - Optimizer internals
> >> > >>> >> >>
> >> > >>> >> >>   - Akka Setup specifics
> >> > >>> >> >>
> >> > >>> >> >>   - Netty and pluggable data exchange strategies
> >> > >>> >> >>
> >> > >>> >> >>   - Testing: Flink test clusters and unit test utilities
> >> > >>> >> >>
> >> > >>> >> >>   - Developer How-To: Setting up Eclipse, IntelliJ, Travis
> >> > >>> >> >>
> >> > >>> >> >>   - Step-by-step guide to add a new operator
> >> > >>> >> >>
> >> > >>> >> >>
> >> > >>> >> >> I will go ahead and stub some sections in the Wiki.
> >> > >>> >> >>
> >> > >>> >> >> As we discuss and agree/disagree with the outline, we can evolve
> >> > >>> >> >> the Wiki.
> >> > >>> >> >>
> >> > >>> >> >> Greetings,
> >> > >>> >> >> Stephan
