flink-dev mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: [jira] [Commented] (FLINK-964) Integrate profiling code with web interface
Date Tue, 26 Aug 2014 10:01:32 GMT
Hey Ufuk and Stephan,

you've replied on dev@ to a conversation happening on JIRA. I would suggest
re-posting your messages in JIRA (there is no automated mirroring).


-- Robert


On Tue, Aug 26, 2014 at 11:57 AM, Stephan Ewen <sewen@apache.org> wrote:

> Very cool first prototype, I like it!
>
> I am posting a quick summary of the status and the other ideas that have
> been floating around in the context of the job profiling:
>
>  - There is quite a bit of profiling data gathered, but I think some stuff
> is also a bit out of date (for example, the gate profiling no longer works
> or makes sense because the internal models changed)
>
>  - We are currently thinking of gathering data stats (byte and record
> counts) from the operators as well. This could go well together with the
> profiling. It would be good if the profiling code was generic in the sense
> that it allows transferring arbitrary time series of metrics. It makes
> sense to define scopes for these metrics, for example "global (cluster
> profiling)", "single machine (machine profiling)", "operator", so these
> metrics would be displayed in the web frontend in the respective section.
>
>  - The memory profiling is a bit senseless right now, because the JVMs are
> always of roughly the same memory size once ramped up. Instead, I would
> add the "managed memory" of Flink.
>
>  - I think a lot of the machine profiling code (CPU utilization, network
> throughput) currently works only on Linux.
>
>
> As a side note: I think it makes sense to integrate the currently separate
> profiling code communication (RPC) with the regular coordination RPCs. That
> is a transparent change (probably 50 lines) once we have Till's changes
> merged, which base the distributed coordination on Akka.
>
>
> On Tue, Aug 26, 2014 at 10:20 AM, Ufuk Celebi <uce@apache.org> wrote:
>
> > This GSoC proposal [1] might also be of interest.
> >
> > [1]
> > https://github.com/stratosphere/stratosphere/wiki/GSoC-2014-Project-Proposal-Draft-by-Rajika-Kumarasiri
> >
> >
> > On Tue, Aug 26, 2014 at 10:12 AM, Sebastian Kruse (JIRA) <jira@apache.org> wrote:
> >
> > >
> > >     [ https://issues.apache.org/jira/browse/FLINK-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110468#comment-14110468 ]
> > >
> > > Sebastian Kruse commented on FLINK-964:
> > > ---------------------------------------
> > >
> > > Hey guys,
> > >
> > > I am happy to hear that you like it! :)
> > >
> > > But please also consider that this prototype was intended as a first
> > > spike and baseline for further discussion. There is a lot more profiling
> > > data available, e.g., stats per task manager and execution vertex. I
> > > propose to have a bit of a discussion about which of those data to
> > > include and how.
> > >
> > > Cheers,
> > > Sebastian
> > >
> > > > Integrate profiling code with web interface
> > > > -------------------------------------------
> > > >
> > > >                 Key: FLINK-964
> > > >                 URL: https://issues.apache.org/jira/browse/FLINK-964
> > > >             Project: Flink
> > > >          Issue Type: Improvement
> > > >          Components: Local Runtime, Webfrontend
> > > >    Affects Versions: 0.6-incubating
> > > >            Reporter: Stephan Ewen
> > > >            Assignee: Jonathan Hasenburg
> > > >
> > > > This issue is subject to discussion.
> > > > > The profiling code currently needs to be kept in sync with the job
> > > > > graph code, execution graph code, and runtime code.
> > > > > Since that part of the code is undergoing quite some changes and the
> > > > > profiling code is not used right now, I suggest removing it or moving
> > > > > it to an artifact repository.
> > >
> > >
> > >
> > > --
> > > This message was sent by Atlassian JIRA
> > > (v6.2#6252)
> > >
> >
>
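
The generic, scoped metric time series suggested in Stephan's message could be sketched roughly as below. This is only an illustration of the idea, not Flink code: `ScopedMetrics`, `Scope`, `Sample`, and `MetricStore`, as well as the metric names and ids used in `main`, are hypothetical and chosen for this example. The three scopes mirror the ones named in the message (global, single machine, operator).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these types are part of Flink. */
public class ScopedMetrics {

    /** Scopes named in the message: cluster-wide, per machine, per operator. */
    enum Scope { GLOBAL, MACHINE, OPERATOR }

    /** One point of a metric time series. */
    static final class Sample {
        final long timestampMillis;
        final double value;

        Sample(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** Collects arbitrary named time series, keyed by scope, scope id, and metric name. */
    static final class MetricStore {
        private final Map<String, List<Sample>> series = new HashMap<>();

        private static String key(Scope scope, String scopeId, String metric) {
            return scope + "/" + scopeId + "/" + metric;
        }

        /** Appends one sample to the series identified by (scope, scopeId, metric). */
        void record(Scope scope, String scopeId, String metric, long timestampMillis, double value) {
            series.computeIfAbsent(key(scope, scopeId, metric), k -> new ArrayList<>())
                  .add(new Sample(timestampMillis, value));
        }

        /** Returns a read-only view of the series, or an empty list if nothing was recorded. */
        List<Sample> get(Scope scope, String scopeId, String metric) {
            List<Sample> found = series.getOrDefault(
                    key(scope, scopeId, metric), Collections.<Sample>emptyList());
            return Collections.unmodifiableList(found);
        }
    }

    public static void main(String[] args) {
        MetricStore store = new MetricStore();
        // Operator-scoped record counts (assumed metric name).
        store.record(Scope.OPERATOR, "map-1", "recordsOut", 1000L, 42.0);
        store.record(Scope.OPERATOR, "map-1", "recordsOut", 2000L, 97.0);
        // Machine-scoped managed memory (assumed metric name).
        store.record(Scope.MACHINE, "host-a", "managedMemoryBytes", 1000L, 512.0 * 1024 * 1024);

        for (Sample s : store.get(Scope.OPERATOR, "map-1", "recordsOut")) {
            System.out.println(s.timestampMillis + " -> " + s.value);
        }
    }
}
```

Keying each series by scope lets a frontend pick out everything belonging to one section (cluster, machine, or operator) without knowing the metric names in advance, which is the kind of genericity the message asks for.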
