ignite-dev mailing list archives

From Maxim Muzafarov <maxmu...@gmail.com>
Subject Re: [IEP-35] Monitoring & Profiling. Proof of concept
Date Tue, 30 Apr 2019 11:33:55 GMT
Hello Nikolay,

I've looked through the changes in your PR.

> Sensors

How will throughput sensor values be recorded, given that they require an
interval for the rate calculation? Do we have such an example? For
instance, getAllocationRate() or getEvictionRate(). These metrics are
out of the scope of the current PoC and IEP, as they are not related to
user metrics, but they are a good example of a particular metric type.
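
To make the question concrete, here is a minimal sketch of what I mean by
a rate sensor. `RateSensorSketch`, `onEvent()` and `ratePerSecond()` are
hypothetical names of mine, not taken from the PR or the Ignite API, and
the injected clock is only an assumption to keep the example testable:

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

/** Hypothetical rate sensor; not part of the PoC or of the Ignite API. */
final class RateSensorSketch {
    /** Total number of events seen so far. */
    private final LongAdder hits = new LongAdder();

    /** Millisecond clock, injected so the interval can be controlled. */
    private final LongSupplier clockMillis;

    /** Counter value and timestamp captured by the previous read. */
    private long lastCnt;
    private long lastTs;

    RateSensorSketch(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
        this.lastTs = clockMillis.getAsLong();
    }

    /** Called on the hot path, e.g. once per allocation or eviction. */
    void onEvent() {
        hits.increment();
    }

    /** Events per second since the previous call; 0 if no time has elapsed. */
    synchronized double ratePerSecond() {
        long now = clockMillis.getAsLong();
        long cnt = hits.sum();
        long dt = now - lastTs;

        double rate = dt <= 0 ? 0.0 : (cnt - lastCnt) * 1000.0 / dt;

        // The interval restarts at every read.
        lastCnt = cnt;
        lastTs = now;

        return rate;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0L};
        RateSensorSketch sensor = new RateSensorSketch(() -> fakeNow[0]);

        for (int i = 0; i < 50; i++)
            sensor.onEvent();

        fakeNow[0] = 2_000L; // Pretend two seconds have passed.

        System.out.println(sensor.ratePerSecond());
    }
}
```

Note that in this sketch the interval is defined by whoever reads the
value, so two exposers polling the same sensor would disturb each other.
That is exactly the kind of detail I'd like to see covered.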

It seems to me that we could add a `sensitivityLevel` parameter to give
the user flexible control over sensors (e.g., INFO, WARN, NOTICE, DEBUG).
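
Roughly, I imagine something like the sketch below. Everything in it is
hypothetical (`SensitivitySketch`, `register()`, `snapshot()`), and the
ordering of the levels from least to most verbose is my assumption:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical sensor registry with a per-sensor sensitivity level. */
final class SensitivitySketch {
    /** Declared from least to most verbose; this ordering is an assumption. */
    enum SensitivityLevel { WARN, NOTICE, INFO, DEBUG }

    private final Map<String, LongSupplier> sensors = new LinkedHashMap<>();
    private final Map<String, SensitivityLevel> levels = new LinkedHashMap<>();

    /** Registers a sensor together with the level at which it becomes visible. */
    void register(String name, SensitivityLevel lvl, LongSupplier val) {
        sensors.put(name, val);
        levels.put(name, lvl);
    }

    /** Reads only the sensors whose level is not more verbose than {@code max}. */
    Map<String, Long> snapshot(SensitivityLevel max) {
        Map<String, Long> res = new LinkedHashMap<>();

        for (Map.Entry<String, LongSupplier> e : sensors.entrySet()) {
            if (levels.get(e.getKey()).compareTo(max) <= 0)
                res.put(e.getKey(), e.getValue().getAsLong());
        }

        return res;
    }

    public static void main(String[] args) {
        SensitivitySketch reg = new SensitivitySketch();

        reg.register("failedJobs", SensitivityLevel.WARN, () -> 1L);
        reg.register("pageReads", SensitivityLevel.DEBUG, () -> 42L);

        // With INFO configured, the DEBUG-level sensor is skipped.
        System.out.println(reg.snapshot(SensitivityLevel.INFO));
    }
}
```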

It also seems that a fully functional Java approach could be used for
the sensors' getValue(). Am I right?
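
By "functional" I mean that a sensor would hold a function that reads the
current value instead of storing a copy of it. A minimal sketch, where
`LongSensor` and its constructor are hypothetical and not the PR's API:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical illustration of a sensor backed by a supplier function. */
final class FunctionalSensorSketch {
    /** A sensor is just a name plus a function that reads the current value. */
    static final class LongSensor {
        private final String name;
        private final LongSupplier src;

        LongSensor(String name, LongSupplier src) {
            this.name = name;
            this.src = src;
        }

        String name() {
            return name;
        }

        /** Evaluates the supplier on every call, so the value is always current. */
        long getValue() {
            return src.getAsLong();
        }
    }

    public static void main(String[] args) {
        // The monitored component keeps its own counter as usual.
        AtomicLong activeJobs = new AtomicLong();

        // The sensor only references how to read it.
        LongSensor sensor = new LongSensor("activeJobs", activeJobs::get);

        activeJobs.set(3);

        System.out.println(sensor.name() + " = " + sensor.getValue());
    }
}
```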

On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <nizhikov@apache.org> wrote:
>
> Hello, Vyacheslav.
>
> Thanks for the feedback!
>
> > HttpExposer with Jetty's dependencies should be detached from the core module.
>
> Agreed. The module hierarchy is a matter for the next steps.
> For now it is just a proof of my ideas for Ignite monitoring that we can discuss.
>
> > I like your approach with 'wrapper' for monitored objects, like 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration' directly as a monitored object for services
>
> Agreed in general.
> It seems that choosing the right data to expose is a matter for separate discussion for each Ignite entity.
> I plan to file tickets for each entity so that anyone interested can share their vision for it.
>
> > In my opinion, each sensor should have a timestamp.
>
> I'm not sure that *every* sensor should have a directly associated timestamp.
> It seems we should support sensors without a timestamp, at least for current monitoring numbers.
>
> > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
>
> What use-cases do you know for such sensors?
> We have plans to support fixed size lists to show "Last N SQL queries" or similar data.
> Essentially, a sensor is just a single value with the name and known meaning.
>
> > It'd be great if you provide a more extended test to show the work of the system.
>
> Sorry for that :)
> When you run 'MonitoringSelfTest', you should open http://localhost:8080/ignite/monitoring to view the exposed info.
> I've provided this info in a gist - https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
>
> I will extend this test to print results to the console in the next iterations - stay tuned :)
>
> On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > Hi, Nikolay,
> >
> > I looked through PR and IEP, and I have some comments:
> >
> > It would be better to implement it as a separate module, I can't say
> > if it is possible for the main part of monitoring or not, but I
> > believe that HttpExposer with Jetty's dependencies should be detached
> > from the core module.
> >
> > I like your approach with 'wrapper' for monitored objects, like
> > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > directly as a monitored object for services. I believe we shouldn't
> > mix approaches. It'd be better to always use some kind of container with
> > the monitored object's information to work with such data.
> >
> > In my opinion, each sensor should have a timestamp. Usually monitoring
> > systems aggregate data and build graphs according to the sensors'
> > timestamps.
> >
> > Also, it'd be great to have an ability to store a list of a fixed size
> > of last N sensors, not to miss them without pushing to an external
> > monitoring system.
> >
> > It'd be great if you provide a more extended test to show the work of
> > the system. Everybody who looks at the PR needs to run the test and get
> > the info manually to see the completeness of sensors; this might be
> > simplified by a proper test.
> >
> > Thank you!
> >
> >
> >
> > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > >
> > > Hello, Igniters.
> > >
> > > I've prepared a Proof of Concept for IEP-35 [1].
> > > PR can be found here - https://github.com/apache/ignite/pull/6510
> > >
> > > I've made the following changes:
> > >
> > >         1. `GridMonitoringManager` [2] - a simple implementation of a manager to store all monitoring info.
> > >         2. `HttpPullExposerSpi` [3] - a pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. The JSON content can be viewed in the gist [4].
> > >         3. Compute task start and finish are monitored in the "compute" list [5].
> > >         4. Service registrations are monitored in the "service" list [6].
> > >         5. The current `IgniteSpiMBeanAdapter` is rewritten using `GridMonitoringManager` [7].
> > >
> > > Design principles, monitoring subsystem details and new Ignite entities can be found in the IEP [1].
> > >
> > > My next steps will be:
> > >
> > >         1. Implementation of JMX exposer
> > >         2. Registration of all "lists" and "sensor groups" as SQL system views.
> > >         3. Add monitoring for all unmonitored Ignite APIs (described in the IEP).
> > >         4. Rewrite existing JMX metrics using GridMonitoringManager.
> > >
> > > Please share your thoughts.
> > >
> > > Part of JSON file:
> > > ```
> > >     "COMPUTE": {
> > >       "tasks": {
> > >         "name": "tasks",
> > >         "rows": [
> > >           {
> > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > >             "sessionId": "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > >             "data": {
> > >               "taskClasName": "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > >               "startTime": 1556287337944,
> > >               "timeout": 9223372036854776000,
> > >               "execName": null
> > >             },
> > >             "name": "anotherBroadcast"
> > >           }
> > > ```
> > >
> > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > [2] https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > [3] https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > [4] https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > [5] https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > [6] https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > [7] https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> >
> >
> >
