flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Steve Cosenza <scose...@twitter.com.INVALID>
Subject Re: Adding a Histogram Metric
Date Sat, 18 Jun 2016 15:18:01 GMT
Perfect!

-Steve

On Saturday, June 18, 2016, Chesnay Schepler <chesnay@apache.org> wrote:

> it would be scoped and reported just like any other Counter.
>
> Regards,
> Chesnay
>
> On 18.06.2016 16:04, Steve Cosenza wrote:
>
> Will the added metric be scoped appropriately (e.g. per operator)?
> Also, if the added metric is a Counter will it be available when listing
> counters in AbstractReporter?
>
> -Steve
>
> On Friday, June 17, 2016, Chesnay Schepler <
> <javascript:_e(%7B%7D,'cvml','chesnay@apache.org');>chesnay@apache.org
> <javascript:_e(%7B%7D,'cvml','chesnay@apache.org');>> wrote:
>
>> That is currently not possible. We would have expose the internal
>> addMetric(String name, Metric metric) method.
>>
>> Regards,
>> Chesnay
>>
>> On 18.06.2016 04:48, Steve Cosenza wrote:
>>
>> Hi Till,
>>
>> How would I plugin a custom counter so that I could use the existing
>> MetricGroup and AbstractReporter functionality?
>>
>> Steve
>>
>> On Friday, June 17, 2016, Till Rohrmann <trohrmann@apache.org
>> <javascript:_e(%7B%7D,'cvml','trohrmann@apache.org');>> wrote:
>>
>>> Hi Steve,
>>>
>>> I thought again about the thread-safe counters and I'm not so sure
>>> anymore whether we should add them or not. The reason is that we could also
>>> mark the Counter class as not final. Then it would be possible to define
>>> your user-defined counters which could be thread-safe. This would reduce
>>> the complexity on the Flink side since covering all metrics use cases might
>>> be difficult. Would that work for you? What do the others think about it?
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, Jun 16, 2016 at 3:33 PM, Till Rohrmann <trohrmann@apache.org
>>> <javascript:_e(%7B%7D,'cvml','trohrmann@apache.org');>> wrote:
>>>
>>>> I agree that dropwizard already offers a lot of functionality and we
>>>> shouldn't reimplement it. Therefore, I've introduced a Flink histogram
>>>> interface which is implemented by a dropwizard histogram wrapper. That way
>>>> Flink has it's own histogram abstraction and dropwizard histograms can be
>>>> used with Flink via this wrapper class. See the PR [1] for more
>>>> information.
>>>>
>>>> [1] https://github.com/apache/flink/pull/2112
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Tue, Jun 14, 2016 at 6:05 PM, Cody Innowhere <e.neverme@gmail.com>
>>>> wrote:
>>>>
>>>>> Counter of dropwizard is thread-safe.
>>>>> I think dropwizard metrics are implemented fairly well and used quite
>>>>> widely in open source projects, I personally on the side of using
>>>>> dropwizard metrics rather than re-implement them, unless for
>>>>> performance
>>>>> reasons. Still, I'm +1 for adding a wrapper on top of dropwizard
>>>>> metrics.
>>>>>
>>>>> On Tue, Jun 14, 2016 at 10:45 PM, Till Rohrmann <trohrmann@apache.org>
>>>>> wrote:
>>>>>
>>>>> > +1 for the thread safe metrics. This should be a rather low hanging
>>>>> fruit
>>>>> > and easily added.
>>>>> >
>>>>> > If we decide to add a histogram, then I would also be in favour
of
>>>>> > implementing our own version of a histogram. This avoids adding
a
>>>>> hard
>>>>> > dependency on Dropwizard or another metrics library to Flink core.
>>>>> Adding
>>>>> > our own implementation would of course entail to also add a
>>>>> Dropwizard
>>>>> > wrapper for reporting. Thus, if a user component required a
>>>>> Dropwizard
>>>>> > histogram, then one could simply use the wrapper.
>>>>> >
>>>>> > Alternatively, one could also rely on external system to compute
>>>>> > histograms. For example, statsD supports the generation of
>>>>> histograms from
>>>>> > a stream of measurements. However, these histograms couldn't be
used
>>>>> within
>>>>> > Flink.
>>>>> >
>>>>> > Implementation wise, the histogram would most likely follow the
>>>>> > implementation of Dropwizard's histogram:
>>>>> >
>>>>> > The basic idea is that a histogram can add samples to a reservoir
>>>>> which it
>>>>> > uses to calculate a set of statistics when queried. The statistics
>>>>> > comprises percentiles, mean, standard deviation and number of
>>>>> elements, for
>>>>> > example.
>>>>> >
>>>>> > The reservoir defines the strategy of how to sample the input
>>>>> stream. There
>>>>> > are different strategies conceivable: Uniform sampling, which
>>>>> constructs a
>>>>> > long term distribution of the seen elements, exponentially decaying
>>>>> > sampling, which favours more recent elements, sliding window or
>>>>> buckets.
>>>>> >
>>>>> > The question is now whether such an implementation already covers
>>>>> most use
>>>>> > cases or whether histograms should support more functionaly.
>>>>> Feedback is
>>>>> > highly appreciated :-)
>>>>> >
>>>>> > Cheers,
>>>>> > Till
>>>>> >
>>>>> > On Mon, Jun 13, 2016 at 6:37 PM, Stephan Ewen <sewen@apache.org
>>>>> <javascript:_e(%7B%7D,'cvml','sewen@apache.org');>> wrote:
>>>>> >
>>>>> > > I think it is totally fine to add a "ThreadsafeCounter" that
uses
>>>>> an
>>>>> > atomic
>>>>> > > long internally
>>>>> > >
>>>>> > > On Sat, Jun 11, 2016 at 7:25 PM, Steve Cosenza <
>>>>> scosenza@twitter.com
>>>>> <javascript:_e(%7B%7D,'cvml','scosenza@twitter.com');>>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Also, we may be able to avoid the need for concurrent
metrics by
>>>>> > > > configuring each Finagle source to use a single thread.
We'll
>>>>> > investigate
>>>>> > > > the performance implications this week and update you
with the
>>>>> results.
>>>>> > > >
>>>>> > > > Thanks,
>>>>> > > > Steve
>>>>> > > >
>>>>> > > >
>>>>> > > > On Friday, June 10, 2016, Steve Cosenza <scosenza@twitter.com
>>>>> <javascript:_e(%7B%7D,'cvml','scosenza@twitter.com');>> wrote:
>>>>> > > >
>>>>> > > >> +Chris Hogue who is also working on operationalizing
Flink with
>>>>> me
>>>>> > > >>
>>>>> > > >> Hi Stephan,
>>>>> > > >>
>>>>> > > >> Thanks for the background on your current implementations!
>>>>> > > >>
>>>>> > > >> While we don't require a specific implementation for
histogram,
>>>>> > counter,
>>>>> > > >> or gauge, it just became clear that we'll need threadsafe
>>>>> versions of
>>>>> > > all
>>>>> > > >> three of these metrics. This is because our messaging
source is
>>>>> > > implemented
>>>>> > > >> using Finagle, and Finagle expects to be able to emit
stats
>>>>> > concurrently
>>>>> > > >> from its managed threads.
>>>>> > > >>
>>>>> > > >> That being said, if adding threadsafe versions of
the Flink
>>>>> counters
>>>>> > is
>>>>> > > >> not an option, we'd also be fine with directly reading
and
>>>>> writing
>>>>> > from
>>>>> > > the
>>>>> > > >> singleton Codahale MetricsRegistry that you start
up in each
>>>>> > > TaskManager.
>>>>> > > >>
>>>>> > > >> Thanks,
>>>>> > > >> Steve
>>>>> > > >>
>>>>> > > >> On Fri, Jun 10, 2016 at 7:10 AM, Stephan Ewen <sewen@apache.org
>>>>> <javascript:_e(%7B%7D,'cvml','sewen@apache.org');>>
>>>>> > wrote:
>>>>> > > >>
>>>>> > > >>> A recent discussion brought up the point of adding
a
>>>>> "histogram"
>>>>> > metric
>>>>> > > >>> type to Flink. This open thread is to gather some
more of the
>>>>> > > requirements
>>>>> > > >>> for that metric.
>>>>> > > >>>
>>>>> > > >>> The most important question is whether users need
Flink to
>>>>> offer
>>>>> > > >>> specific implementations of "Histogram", like
for example the "
>>>>> > > >>> com.codahale.metrics.Histogram", or if a "
>>>>> > > >>> org.apache.flink.metrics.Histogram" interface
would work as
>>>>> well.
>>>>> > > >>> The histogram could still be reported for example
via
>>>>> dropwizard
>>>>> > > >>> reporters.
>>>>> > > >>>
>>>>> > > >>> *Option (1):* If a Flink Histogram works as well,
it would be
>>>>> simple
>>>>> > to
>>>>> > > >>> add one. The dropwizard reporter would need to
wrap the Flink
>>>>> > > Histogram for
>>>>> > > >>> reporting.
>>>>> > > >>>
>>>>> > > >>> *Option (2)*: If the code needs the specific Dropwizard
>>>>> Histogram
>>>>> > type,
>>>>> > > >>> then one would need a wrapper class that makes
a Flink
>>>>> Histogram look
>>>>> > > like
>>>>> > > >>> a dropwizard histogram.
>>>>> > > >>>
>>>>> > > >>> ----------
>>>>> > > >>>
>>>>> > > >>> As a bit of background for the discussion, here
are some
>>>>> thoughts
>>>>> > > behind
>>>>> > > >>> the way that Metrics are currently implemented
in Flink.
>>>>> > > >>>
>>>>> > > >>>   - The metric types in Flink are independent
from libraries
>>>>> like
>>>>> > > >>> "dropwizard" to reduce dependencies and retain
freedom to swap
>>>>> > > >>> implementations.
>>>>> > > >>>
>>>>> > > >>>   - Metric reporting allows to reuse reporters
from dropwizard
>>>>> > > >>>
>>>>> > > >>>   - Some Flink metric implementations are also
more
>>>>> lightweight than
>>>>> > > for
>>>>> > > >>> example in dropwizard. Counters for example are
not thread
>>>>> safe, but
>>>>> > > do not
>>>>> > > >>> impose memory barriers. That is important for
metrics deep in
>>>>> the
>>>>> > > streaming
>>>>> > > >>> runtime.
>>>>> > > >>>
>>>>> > > >>>
>>>>> > > >>>
>>>>> > > >>
>>>>> > > >
>>>>> > > > --
>>>>> > > > -Steve
>>>>> > > >
>>>>> > > > Sent from Gmail Mobile
>>>>> > > >
>>>>> > >
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>> --
>> -Steve
>>
>> Sent from Gmail Mobile
>>
>>
>>
>
> --
> -Steve
>
> Sent from Gmail Mobile
>
>
>

-- 
-Steve

Sent from Gmail Mobile

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message