flink-dev mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: [Discuss] Organizing Documentation for Configuration Options
Date Tue, 07 Feb 2017 10:13:14 GMT
+1 to automate this and describe the config parameters in the code.

That's exactly the approach Apache Kafka is taking as well for their config.

On Tue, Feb 7, 2017 at 9:04 AM, Ufuk Celebi <uce@apache.org> wrote:

> I fully agree with you, Greg.
>
> Since this is doomed to get out of sync again very shortly after cleanup,
> I vote to automate this. Stephan introduced the ConfigOption type, which
> makes it easy to define the options. It's already planned to migrate all
> configuration options from ConfigConstants to this approach.
>
> For an example see here:
> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
>
> I think that it is possible to build the configuration docs page from this
> with reasonable effort.
>
> This would translate the task to:
> 1) Automate ConfigOption to HTML/Markdown generation
> 2) Extend ConfigOption with description fields
> 3) Migrate ConfigConstants to ConfigOptions
>
> I would also volunteer to take a first stab at this.
>
> Regarding the network buffers: +1 to your suggestion. Nico (cc'd) is
> starting to work on automating the network buffer configuration in order to
> get rid of any manual tuning for most users (because of the issues you
> described, and because streaming and batch jobs require different tuning,
> which complicates things even more).
>
> – Ufuk
>
> On 6 February 2017 at 19:21:28, Greg Hogan (code@greghogan.com) wrote:
> > Hi devs,
> >
> > Flink's Configuration page [1] has grown intimidatingly long and complex.
> > Options are described across three main sections: common options (single
> > section), advanced options (multiple sections), and full reference. The
> > trailing "background" section further describes the most impactful options
> > in much greater detail.
> >
> > Several recent tickets, and a few outstanding, have added missing options
> > to the configuration documentation. I'd like to propose a goal of
> > organizing all options in the full reference into alphabetized, tabular
> > form (one table per section), much like the system metrics [2]. Columns
> > would be option name, description, and default value.
> >
> > The common and advanced sections could also be converted to tabular form
> > with the exception of Kerberos-based Security. Missing options would be
> > added to the full reference.
> >
> > Lastly, the simple heuristic for configuring network buffers [3] has
> > prompted many questions on the mailing list. With the 1.3 release, the
> > total number of buffers and the number of available buffers are reported
> > through metrics and in the web dashboard. My experience has been that the
> > number of required buffers is highly dependent on job topology and cluster
> > performance. I propose keeping the simple heuristic and description while
> > directing users to monitor the balance of available buffers.
> >
> > Greg
> >
> > [1] https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html
> > [2] https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html#system-metrics
> > [3] https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers
>
>

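Editor's sketch: the plan quoted above (give each option a description field, then generate the documentation table from the option definitions) could look roughly like the following. This is only an illustration under stated assumptions. DocumentedOption and ConfigDocSketch are hypothetical names, not Flink's actual ConfigOption API, and the option keys in main are made up. The columns follow Greg's proposal: option name, description, default value, sorted alphabetically.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an option type that carries its own description.
// This is NOT Flink's ConfigOption API; all names here are illustrative.
final class DocumentedOption {
    final String key;
    final String defaultValue;
    final String description;

    DocumentedOption(String key, String defaultValue, String description) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.description = description;
    }
}

final class ConfigDocSketch {
    // Renders the options as one Markdown table, sorted alphabetically by key,
    // with the columns proposed in the thread: name, description, default.
    static String toMarkdownTable(List<DocumentedOption> options) {
        List<DocumentedOption> sorted = new ArrayList<>(options);
        sorted.sort(Comparator.comparing((DocumentedOption o) -> o.key));

        StringBuilder sb = new StringBuilder();
        sb.append("| Option | Description | Default |\n");
        sb.append("|---|---|---|\n");
        for (DocumentedOption o : sorted) {
            sb.append("| `").append(o.key).append("` | ")
              .append(escape(o.description)).append(" | ")
              .append(escape(o.defaultValue)).append(" |\n");
        }
        return sb.toString();
    }

    // A literal pipe would end the table cell early, so escape it.
    private static String escape(String s) {
        return s.replace("|", "\\|");
    }

    public static void main(String[] args) {
        List<DocumentedOption> options = new ArrayList<>();
        // Made-up keys, used only to exercise the generator.
        options.add(new DocumentedOption("example.b.timeout", "10 s", "Made-up option for illustration."));
        options.add(new DocumentedOption("example.a.enabled", "false", "Another made-up option."));
        System.out.print(toMarkdownTable(options));
    }
}
```

Keeping the description next to the key and default in code means the generated table is rebuilt from the same definitions the runtime reads, which is the property the thread is after: the docs cannot silently drift from the options.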