flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Improve the documentation of the Flink Architecture and internals
Date Fri, 20 Mar 2015 18:27:56 GMT
For me as well. Earlier today it said "down for maintenance"

On Fri, Mar 20, 2015 at 7:14 PM, Kostas Tzoumas <ktzoumas@apache.org> wrote:

> it's down for me as well
>
> On Fri, Mar 20, 2015 at 7:12 PM, Henry Saputra <henry.saputra@gmail.com>
> wrote:
>
> > Is the wiki down for any of you?
> >
> > I can't access
> > https://cwiki.apache.org/confluence/display/FLINK/Apache+Flink+Home
> >
> > 404
> >
> > - Henry
> >
> > On Fri, Mar 20, 2015 at 4:46 AM, Kostas Tzoumas <ktzoumas@apache.org>
> > wrote:
> > > I added a document for data exchange between tasks:
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
> > >
> > > Feel free to edit. I plan to link the class names to the class files in
> > > github.
> > >
> > > On Tue, Mar 17, 2015 at 11:17 AM, Kostas Tzoumas <ktzoumas@apache.org>
> > > wrote:
> > >
> > >> +1 for the Wiki.
> > >>
> > >> When these have been stabilized we can move them to the docs if we decide
> > >> to do so.
> > >>
> > >> On Mon, Mar 16, 2015 at 10:07 PM, Stephan Ewen <sewen@apache.org> wrote:
> > >>
> > >>> I have put my suggested version of an outline for the docs into the wiki.
> > >>> Regardless of where the docs end up (wiki or repository), we can use the wiki
> > >>> to outline the docs.
> > >>>
> > >>> https://cwiki.apache.org/confluence/display/FLINK/Flink+Internals
> > >>>
> > >>> Some pages contain some stub or outline, others are completely blank.
> > >>>
> > >>> Not a complete list. Additions are welcome.
> > >>>
> > >>> On Mon, Mar 16, 2015 at 10:04 PM, Stephan Ewen <sewen@apache.org> wrote:
> > >>>
> > >>> > I think the Wiki has a much lower barrier to entry to fix docs, especially
> > >>> > for external people. The docs, with the Jekyll setup, are rather tricky.
> > >>> > I would very much like all kinds of people to contribute to the docs
> > >>> > about the internals, not just the usual three suspects that have done this
> > >>> > so far.
> > >>> >
> > >>> > Having a good landing page in the regular docs is exactly to not lose all
> > >>> > the people that do not look into a wiki. The overview pages for the
> > >>> > internals need to be good and accessible and nicely link to the wiki to
> > >>> > "forward" people there.
> > >>> >
> > >>> > The overhead of deciding what goes where should not be terribly large, in
> > >>> > my opinion, since there is no really "wrong" place to put it.
> > >>> >
> > >>> >
> > >>> >
> > >>> > On Mon, Mar 16, 2015 at 9:58 PM, Aljoscha Krettek <aljoscha@apache.org>
> > >>> > wrote:
> > >>> >
> > >>> >> Why do you want to split stuff between the docs in the repository and
> > >>> >> the wiki? I for one would always be too lazy to check stuff in a wiki
> > >>> >> when there is also documentation. Plus, this would lead to
> > >>> >> additional overhead in deciding what goes where and syncing between
> > >>> >> the two places for documentation.
> > >>> >>
> > >>> >> On Mon, Mar 16, 2015 at 7:59 PM, Stephan Ewen <sewen@apache.org> wrote:
> > >>> >> > Ah, I totally forgot to add to the internals:
> > >>> >> >
> > >>> >> >   - Fault tolerance in Batch mode
> > >>> >> >
> > >>> >> >   - Fault Tolerance in Streaming Mode, with state handling
> > >>> >> >
> > >>> >> > On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen <sewen@apache.org> wrote:
> > >>> >> >
> > >>> >> >> Hi all!
> > >>> >> >>
> > >>> >> >> I would like to kick off an effort to improve the documentation of the
> > >>> >> >> Flink Architecture and internals. This also means making the streaming
> > >>> >> >> architecture more prominent in the docs.
> > >>> >> >>
> > >>> >> >> Since Flink is quite a sophisticated stack, we need to improve the
> > >>> >> >> presentation of how Flink works - to the extent necessary to use Flink
> > >>> >> >> (and to appreciate all the cool stuff that is happening). This should
> > >>> >> >> also come in handy for new contributors.
> > >>> >> >>
> > >>> >> >> As a general umbrella, we need to first decide where and how to organize
> > >>> >> >> the documentation.
> > >>> >> >>
> > >>> >> >> I would propose to put the bulk of the documentation into the Wiki. Create
> > >>> >> >> a dedicated section on Flink Internals and sub-pages for each component /
> > >>> >> >> topic. To the docs, we add a general overview from which we link into the
> > >>> >> >> Wiki.
> > >>> >> >>
> > >>> >> >>
> > >>> >> >>  == These sections would go into the DOCS in the git repository ==
> > >>> >> >>
> > >>> >> >>   - Overview of Program, pre-flight phase (type extraction, optimizer),
> > >>> >> >> JobManager, TaskManager. Differences between streaming and batch. We can
> > >>> >> >> realize this through one very nice picture with few lines of text.
> > >>> >> >>
> > >>> >> >>   - High level architecture stack, different program representations (API
> > >>> >> >> operators, common API DAG, optimizer DAG, parallel data flow (JobGraph /
> > >>> >> >> Execution Graph))
> > >>> >> >>
> > >>> >> >>   - (maybe) Parallelism and scheduling. This seems to be paramount for
> > >>> >> >> users to understand.
> > >>> >> >>
> > >>> >> >>   - Processes (JobManager, TaskManager, Webserver, WebClient, CLI client)
> > >>> >> >>
> > >>> >> >>
> > >>> >> >>
> > >>> >> >>  == These sections would go into the WIKI ==
> > >>> >> >>
> > >>> >> >>   - Project structure (maven projects, what is where, dependencies between
> > >>> >> >> projects)
> > >>> >> >>
> > >>> >> >>   - Component overview
> > >>> >> >>
> > >>> >> >>     -> JobManager (InstanceManager, Scheduler, BLOB server, Library Cache,
> > >>> >> >> Archiving)
> > >>> >> >>
> > >>> >> >>     -> TaskManager (MemoryManager, IOManager, BLOB Cache, Library Cache)
> > >>> >> >>
> > >>> >> >>     -> Involved Actor Systems / Actors / Messages
> > >>> >> >>
> > >>> >> >>   - Details about submitting a job (library upload, job graph submission,
> > >>> >> >> execution graph setup, scheduling trigger)
> > >>> >> >>
> > >>> >> >>   - Memory Management
> > >>> >> >>
> > >>> >> >>   - Optimizer internals
> > >>> >> >>
> > >>> >> >>   - Akka Setup specifics
> > >>> >> >>
> > >>> >> >>   - Netty and pluggable data exchange strategies
> > >>> >> >>
> > >>> >> >>   - Testing: Flink test clusters and unit test utilities
> > >>> >> >>
> > >>> >> >>   - Developer How-To: Setting up Eclipse, IntelliJ, Travis
> > >>> >> >>
> > >>> >> >>   - Step-by-step guide to add a new operator
> > >>> >> >>
> > >>> >> >>
> > >>> >> >> I will go ahead and stub some sections in the Wiki.
> > >>> >> >>
> > >>> >> >> As we discuss and agree/disagree with the outline, we can evolve the Wiki.
> > >>> >> >>
> > >>> >> >> Greetings,
> > >>> >> >> Stephan
> > >>> >> >>
> > >>> >> >>
> > >>> >>
> > >>> >
> > >>> >
> > >>>
> > >>
> > >>
> >
>
