flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Till Rohrmann <trohrm...@apache.org>
Subject Re: Adding a Histogram Metric
Date Thu, 16 Jun 2016 13:33:20 GMT
I agree that dropwizard already offers a lot of functionality and we
shouldn't reimplement it. Therefore, I've introduced a Flink histogram
interface which is implemented by a dropwizard histogram wrapper. That way
Flink has it's own histogram abstraction and dropwizard histograms can be
used with Flink via this wrapper class. See the PR [1] for more information.

[1] https://github.com/apache/flink/pull/2112

Cheers,
Till

On Tue, Jun 14, 2016 at 6:05 PM, Cody Innowhere <e.neverme@gmail.com> wrote:

> Counter of dropwizard is thread-safe.
> I think dropwizard metrics are implemented fairly well and used quite
> widely in open source projects, I personally on the side of using
> dropwizard metrics rather than re-implement them, unless for performance
> reasons. Still, I'm +1 for adding a wrapper on top of dropwizard metrics.
>
> On Tue, Jun 14, 2016 at 10:45 PM, Till Rohrmann <trohrmann@apache.org>
> wrote:
>
> > +1 for the thread safe metrics. This should be a rather low hanging fruit
> > and easily added.
> >
> > If we decide to add a histogram, then I would also be in favour of
> > implementing our own version of a histogram. This avoids adding a hard
> > dependency on Dropwizard or another metrics library to Flink core. Adding
> > our own implementation would of course entail to also add a Dropwizard
> > wrapper for reporting. Thus, if a user component required a Dropwizard
> > histogram, then one could simply use the wrapper.
> >
> > Alternatively, one could also rely on external system to compute
> > histograms. For example, statsD supports the generation of histograms
> from
> > a stream of measurements. However, these histograms couldn't be used
> within
> > Flink.
> >
> > Implementation wise, the histogram would most likely follow the
> > implementation of Dropwizard's histogram:
> >
> > The basic idea is that a histogram can add samples to a reservoir which
> it
> > uses to calculate a set of statistics when queried. The statistics
> > comprises percentiles, mean, standard deviation and number of elements,
> for
> > example.
> >
> > The reservoir defines the strategy of how to sample the input stream.
> There
> > are different strategies conceivable: Uniform sampling, which constructs
> a
> > long term distribution of the seen elements, exponentially decaying
> > sampling, which favours more recent elements, sliding window or buckets.
> >
> > The question is now whether such an implementation already covers most
> use
> > cases or whether histograms should support more functionaly. Feedback is
> > highly appreciated :-)
> >
> > Cheers,
> > Till
> >
> > On Mon, Jun 13, 2016 at 6:37 PM, Stephan Ewen <sewen@apache.org> wrote:
> >
> > > I think it is totally fine to add a "ThreadsafeCounter" that uses an
> > atomic
> > > long internally
> > >
> > > On Sat, Jun 11, 2016 at 7:25 PM, Steve Cosenza <scosenza@twitter.com>
> > > wrote:
> > >
> > > > Also, we may be able to avoid the need for concurrent metrics by
> > > > configuring each Finagle source to use a single thread. We'll
> > investigate
> > > > the performance implications this week and update you with the
> results.
> > > >
> > > > Thanks,
> > > > Steve
> > > >
> > > >
> > > > On Friday, June 10, 2016, Steve Cosenza <scosenza@twitter.com>
> wrote:
> > > >
> > > >> +Chris Hogue who is also working on operationalizing Flink with me
> > > >>
> > > >> Hi Stephan,
> > > >>
> > > >> Thanks for the background on your current implementations!
> > > >>
> > > >> While we don't require a specific implementation for histogram,
> > counter,
> > > >> or gauge, it just became clear that we'll need threadsafe versions
> of
> > > all
> > > >> three of these metrics. This is because our messaging source is
> > > implemented
> > > >> using Finagle, and Finagle expects to be able to emit stats
> > concurrently
> > > >> from its managed threads.
> > > >>
> > > >> That being said, if adding threadsafe versions of the Flink counters
> > is
> > > >> not an option, we'd also be fine with directly reading and writing
> > from
> > > the
> > > >> singleton Codahale MetricsRegistry that you start up in each
> > > TaskManager.
> > > >>
> > > >> Thanks,
> > > >> Steve
> > > >>
> > > >> On Fri, Jun 10, 2016 at 7:10 AM, Stephan Ewen <sewen@apache.org>
> > wrote:
> > > >>
> > > >>> A recent discussion brought up the point of adding a "histogram"
> > metric
> > > >>> type to Flink. This open thread is to gather some more of the
> > > requirements
> > > >>> for that metric.
> > > >>>
> > > >>> The most important question is whether users need Flink to offer
> > > >>> specific implementations of "Histogram", like for example the
"
> > > >>> com.codahale.metrics.Histogram", or if a "
> > > >>> org.apache.flink.metrics.Histogram" interface would work as well.
> > > >>> The histogram could still be reported for example via dropwizard
> > > >>> reporters.
> > > >>>
> > > >>> *Option (1):* If a Flink Histogram works as well, it would be
> simple
> > to
> > > >>> add one. The dropwizard reporter would need to wrap the Flink
> > > Histogram for
> > > >>> reporting.
> > > >>>
> > > >>> *Option (2)*: If the code needs the specific Dropwizard Histogram
> > type,
> > > >>> then one would need a wrapper class that makes a Flink Histogram
> look
> > > like
> > > >>> a dropwizard histogram.
> > > >>>
> > > >>> ----------
> > > >>>
> > > >>> As a bit of background for the discussion, here are some thoughts
> > > behind
> > > >>> the way that Metrics are currently implemented in Flink.
> > > >>>
> > > >>>   - The metric types in Flink are independent from libraries like
> > > >>> "dropwizard" to reduce dependencies and retain freedom to swap
> > > >>> implementations.
> > > >>>
> > > >>>   - Metric reporting allows to reuse reporters from dropwizard
> > > >>>
> > > >>>   - Some Flink metric implementations are also more lightweight
> than
> > > for
> > > >>> example in dropwizard. Counters for example are not thread safe,
> but
> > > do not
> > > >>> impose memory barriers. That is important for metrics deep in
the
> > > streaming
> > > >>> runtime.
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > >
> > > > --
> > > > -Steve
> > > >
> > > > Sent from Gmail Mobile
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message