flink-dev mailing list archives

From Henry Saputra <henry.sapu...@gmail.com>
Subject Re: Statistics collection for optimization
Date Thu, 04 Dec 2014 02:48:41 GMT
Hi Guys,

I finally have time to look at this. Could you create a design
document to share how you would approach this feature?

There are two parts, I guess: one is the metrics collection itself, and
the second is publishing the metrics, either via JMX or to an external
service.
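
To make the two parts concrete, here is a rough sketch of how I picture
them. It is only an illustration under my own assumptions, not a proposed
design: all names (StatsSketch, ColumnStats, ColumnStatsView, observe,
publish, the "sketch.stats" JMX domain) are made up, it uses the plain
JMX API from the JDK rather than the Metrics library, and the distinct
count is exact (a HashSet), which would not scale to real data sets.

```java
import java.lang.management.ManagementFactory;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class StatsSketch {

    /** Read-only view of the statistics, exposed over JMX (hypothetical name). */
    public interface ColumnStatsView {
        long getCount();
        long getMin();
        long getMax();
        int getDistinct();
    }

    /** Part 1: collection. Tracks count, min, max and an exact distinct count. */
    public static class ColumnStats implements ColumnStatsView {
        private long count;
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;
        // Assumption: exact distinct tracking; a real version would use a sketch.
        private final Set<Long> seen = new HashSet<>();

        public synchronized void update(long value) {
            count++;
            min = Math.min(min, value);
            max = Math.max(max, value);
            seen.add(value);
        }

        @Override public synchronized long getCount() { return count; }
        @Override public synchronized long getMin() { return min; }
        @Override public synchronized long getMax() { return max; }
        @Override public synchronized int getDistinct() { return seen.size(); }
    }

    /** Identity mapper: records every value and passes it through unchanged. */
    public static Function<Long, Long> observe(ColumnStats stats) {
        return value -> {
            stats.update(value);
            return value;
        };
    }

    /** Part 2: publishing. Registers the statistics with the platform MBean server. */
    public static ObjectName publish(ColumnStats stats, String name) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName objectName =
            new ObjectName("sketch.stats:type=ColumnStats,name=" + name);
        // StandardMBean lets us pick the management interface explicitly.
        server.registerMBean(new StandardMBean(stats, ColumnStatsView.class), objectName);
        return objectName;
    }

    public static void main(String[] args) throws Exception {
        ColumnStats stats = new ColumnStats();
        Function<Long, Long> mapper = observe(stats);
        for (long value : new long[] {5, 3, 9, 3}) {
            mapper.apply(value);
        }
        ObjectName objectName = publish(stats, "demo");
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("min=" + server.getAttribute(objectName, "Min")
            + " max=" + server.getAttribute(objectName, "Max")
            + " distinct=" + server.getAttribute(objectName, "Distinct"));
    }
}
```

The collection side knows nothing about JMX, so the publishing side
could be swapped for another reporter without touching the mapper.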

We could use Metrics [1] aka codahale's metrics library to do it.

- Henry

[1] https://github.com/dropwizard/metrics

On Tue, Dec 2, 2014 at 5:25 AM, Alexander Alexandrov
<alexander.s.alexandrov@gmail.com> wrote:
> I checked the thread. I am not sure whether this is aligned with what I
> want to contribute.
>
> The discussion in the other thread seems to be going in the direction of
> general-purpose monitoring (you are talking about Disk + Network IO, input
> splits).
>
> I would like to have a very thin code base that can be (1) transparently
> injected in UDFs (if you can manipulate the AST), or wrapped in identity
> mappers (if you cannot) in order to gather collection statistics (min, max,
> distinct, maybe some histograms) to facilitate incremental optimization.
>
> I agree that this should be based on existing infrastructure (Akka) and
> should not be over-engineered.
>
> I will announce this in the other branch and create a JIRA ticket to fix
> the parameters of what has to be done and the best way to implement it with
> the other contributors.
>
>
>
> 2014-12-02 14:12 GMT+01:00 Kostas Tzoumas <ktzoumas@apache.org>:
>
>> From the status of that thread and absence of a JIRA (as far as I could
>> tell), I would suggest that you start working on this and announce it on
>> the other thread, perhaps Nils would be interested in jumping in.
>>
>> On Tue, Dec 2, 2014 at 2:06 PM, Ufuk Celebi <uce@apache.org> wrote:
>>
>> > Very nice to hear :)
>> >
>> > See this thread:
>> >
>> >
>> > http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Enhance-Flink-s-monitoring-capabilities-td2573.html
>> >
>> > On Tue, Dec 2, 2014 at 2:00 PM, Alexander Alexandrov <
>> > alexander.s.alexandrov@gmail.com> wrote:
>> >
>> > > Just a quick shout to check whether somebody is already working on a
>> > > statistics collection component?
>> > >
>> > > If yes, can you point me to previous discussions in the mailing list and
>> > > a WIP branch -- I want to bring myself up to date with the ongoing
>> > > efforts.
>> > >
>> > > If not, I would like to start working on that component and ideally
>> > > integrate some parts of it in the 0.8 release.
>> > >
>> > > Cheers!
>> > >
>> >
>>
