flink-dev mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: Statistics collection for optimization
Date Tue, 02 Dec 2014 13:43:35 GMT
Have you also thought about adding the statistics collection to the
writers, i.e. the collector or record writer?

If all you care about is the data that the user emits from her code, that
should be fine.
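
A minimal sketch of the writer-side idea, assuming a simplified one-method emitter interface. `Emitter` and `CountingEmitter` are hypothetical names used only for illustration and are not the actual Flink collector or record writer API. The wrapper records a statistic (here just a record count) and forwards every record unchanged:

```java
// Hypothetical stand-in for the runtime's collector interface (not the Flink API).
interface Emitter<T> {
    void emit(T record);
}

// Wraps any emitter, counts the records passing through, and forwards them unchanged.
final class CountingEmitter<T> implements Emitter<T> {
    private final Emitter<T> delegate;
    private long count;

    CountingEmitter(Emitter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void emit(T record) {
        count++;                // statistics side effect
        delegate.emit(record);  // the record itself is not modified
    }

    long getCount() {
        return count;
    }
}
```

Because the wrapper sits behind the same interface as the emitter it decorates, user code would not need to change; it only sees records that the user code actually emits.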

On Tue, Dec 2, 2014 at 2:33 PM, Robert Metzger <rmetzger@apache.org> wrote:

> Yes. I also got the impression that you are looking for something slightly
> different.
>
> It is probably easier for you right now to "hack" something into the system
> to get these statistics.
>
> On Tue, Dec 2, 2014 at 2:25 PM, Alexander Alexandrov <
> alexander.s.alexandrov@gmail.com> wrote:
>
> > I checked the thread. I am not sure whether this is aligned with what I
> > want to contribute.
> >
> > The discussion in the other thread seems to be going in the direction of
> > general-purpose monitoring (you are talking about Disk + Network IO,
> > input splits).
> >
> > I would like to have a very thin code base that can be either (1)
> > transparently injected in UDFs (if you can manipulate the AST), or (2)
> > wrapped in identity mappers (if you cannot), in order to gather
> > collection statistics (min, max, distinct, maybe some histograms) to
> > facilitate incremental optimization.
> >
> > I agree that this should be based on existing infrastructure (Akka) and
> > should not be over-engineered.
> >
> > I will announce this in the other thread and create a JIRA ticket to fix
> > the parameters of what has to be done and the best way to implement it
> > with the other contributors.
> >
> >
> >
> > 2014-12-02 14:12 GMT+01:00 Kostas Tzoumas <ktzoumas@apache.org>:
> >
> > > From the status of that thread and absence of a JIRA (as far as I could
> > > tell), I would suggest that you start working on this and announce it
> > > on the other thread; perhaps Nils would be interested in jumping in.
> > >
> > > On Tue, Dec 2, 2014 at 2:06 PM, Ufuk Celebi <uce@apache.org> wrote:
> > >
> > > > Very nice to hear :)
> > > >
> > > > See this thread:
> > > >
> > > >
> > >
> >
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Enhance-Flink-s-monitoring-capabilities-td2573.html
> > > >
> > > > On Tue, Dec 2, 2014 at 2:00 PM, Alexander Alexandrov <
> > > > alexander.s.alexandrov@gmail.com> wrote:
> > > >
> > > > > > Just a quick shout to check whether somebody is already working
> > > > > > on a statistics collection component?
> > > > >
> > > > > > If yes, can you point me to previous discussions in the mailing
> > > > > > list and a WIP branch -- I want to bring myself up to date with
> > > > > > the ongoing efforts.
> > > > >
> > > > > If not, I would like to start working on that component and ideally
> > > > > integrate some parts of it in the 0.8 release.
> > > > >
> > > > > Cheers!
> > > > >
> > > >
> > >
> >
>
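
For illustration, the identity-mapper approach mentioned in the quoted messages could look roughly like the sketch below. It is an assumption-laden example, not code from this thread: `StatsIdentityMapper` is a hypothetical name, it uses `java.util.function.Function` instead of a Flink map function so that it is self-contained, it assumes non-null inputs, and it tracks distinct values exactly with a `HashSet`, which would not scale to large inputs (an approximate sketch structure would be needed there). How the gathered statistics are reported back (e.g. via Akka) is left out.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical identity mapper: returns every value unchanged and
// updates min, max, count, and an exact distinct set as a side effect.
final class StatsIdentityMapper<T extends Comparable<T>> implements Function<T, T> {
    private T min;
    private T max;
    private long count;
    private final Set<T> distinct = new HashSet<>(); // exact; memory grows with cardinality

    @Override
    public T apply(T value) {
        // Assumes value is non-null.
        count++;
        if (min == null || value.compareTo(min) < 0) {
            min = value;
        }
        if (max == null || value.compareTo(max) > 0) {
            max = value;
        }
        distinct.add(value);
        return value; // identity: the data flow is not altered
    }

    T getMin() { return min; }
    T getMax() { return max; }
    long getCount() { return count; }
    long getDistinctCount() { return distinct.size(); }
}
```

An instance can be dropped into any map step, for example `Stream.of(3, 1, 2, 3).map(mapper)`, and queried for its statistics once the input has been consumed.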
