ignite-dev mailing list archives

From Andrey Gura <ag...@apache.org>
Subject Re: Cache operations performance metrics
Date Thu, 19 Dec 2019 13:03:31 GMT
From my point of view, Ignite should provide meaningful metrics for
internal components that are useful for monitoring and analysis.
All of the suggested options are meaningless in that sense. Below I'll try
to explain why.

>* `get`, `put`, `remove` time histograms. Measured for API calls on the caller node side.
>    Implemented in [1], commit [2].

All cache operations in Ignite are distributed, so each value measured
for a cache operation will vary depending on where the operation is
actually performed. Values will be the same only in cases where the node
is remote relative to the data (e.g. a client node).

For a regular data node (server node), timing depends on the answers to these questions:

- is the node primary for the particular key or not? (for all operations)
- how many backups are configured for the cache? (for put and remove)
- which write synchronization mode is configured for the cache? (for put and remove)
- is readFromBackup enabled for the cache? (for get)
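
As an illustration only, here is a deliberately simplified toy model of how the answers to these questions change what a caller waits for on a `put`. It is not Ignite code and does not reproduce Ignite's actual protocol. The class `PutWaitModel` and the method `stagesAwaited` are hypothetical names, and the enum only mirrors the names of the write synchronization modes.

```java
/** Toy model, not Ignite code: counts remote stages a caller waits for on a put. */
final class PutWaitModel {
    /** Mirrors the names of the write synchronization modes (assumption: simplified semantics). */
    enum SyncMode { FULL_SYNC, PRIMARY_SYNC, FULL_ASYNC }

    /**
     * @param callerIsPrimary whether the calling node is primary for the key
     * @param backups number of backups configured for the cache
     * @param mode write synchronization mode of the cache
     * @return number of remote stages the caller waits for in this toy model
     */
    static int stagesAwaited(boolean callerIsPrimary, int backups, SyncMode mode) {
        if (mode == SyncMode.FULL_ASYNC)
            return 0; // the caller does not wait for any remote acknowledgement

        int stages = callerIsPrimary ? 0 : 1; // reaching a remote primary is one stage

        if (mode == SyncMode.FULL_SYNC && backups > 0)
            stages += 1; // backups are updated in parallel, so they count as one stage

        return stages;
    }
}
```

Under these assumptions the same `put` waits for zero, one, or two remote stages depending only on topology and configuration, so a single histogram on the caller side mixes all of these cases.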

Neither Ignite users nor Ignite developers can make any decision based
on these metrics.

> * `commit`, `rollback` time histograms. Measured for API calls on the caller node side [3].

What is transaction commit or rollback time? How is it calculated in
Ignite now? Which actions are included in a transaction? Which actions
unrelated to caches are executed during a transaction?

The time of a transaction commit or rollback makes no sense
because there is no way to tell which transaction was
performed in a particular period of time. Usually there are a lot of
transactions, and we can't distinguish them from each other.

Moreover, a transaction is usually treated as a business operation. So the
only way to measure performance properly is to measure the business
operation time. That is, the user should create their own set of metrics
for a business API.
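
A minimal sketch of such a user-side metric, assuming nothing about Ignite's metrics API: the class `BusinessOpTimer` and its methods are hypothetical, and the bucket bounds are chosen by the user. It times a whole business operation, whatever cache calls and other work it contains, and counts the duration in a fixed set of buckets.

```java
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.function.Supplier;

/** Hypothetical user-side timer for one business operation; not part of Ignite. */
final class BusinessOpTimer {
    private final long[] boundsNanos;     // ascending, inclusive upper bounds of the buckets
    private final AtomicLongArray counts; // one extra bucket for values above the last bound

    BusinessOpTimer(long... boundsNanos) {
        this.boundsNanos = boundsNanos.clone();
        this.counts = new AtomicLongArray(boundsNanos.length + 1);
    }

    /** Runs the operation and records its wall-clock duration, even if it throws. */
    <T> T time(Supplier<T> op) {
        long start = System.nanoTime();
        try {
            return op.get();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    /** Adds one observation to the first bucket whose bound is not exceeded. */
    void record(long durationNanos) {
        int i = 0;
        while (i < boundsNanos.length && durationNanos > boundsNanos[i])
            i++;
        counts.incrementAndGet(i);
    }

    /** Returns the number of observations in the given bucket. */
    long count(int bucket) {
        return counts.get(bucket);
    }
}
```

A caller would wrap the whole business method, for example `timer.time(() -> placeOrder(request))`, where `placeOrder` stands for any user-defined operation.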

Further, what about cross-cache transactions? At the moment the tx
commit/rollback time is added to the corresponding metrics of each
cache involved in the transaction. The *same time* for *each cache*.
Absolutely meaningless.
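
The effect can be sketched with a small stand-alone model. This is not Ignite's implementation: the class `PerCacheCommitTime` and the cache names used with it are hypothetical. It only shows what happens when one commit duration is recorded against every cache in the transaction.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model, not Ignite code: one commit time recorded against every involved cache. */
final class PerCacheCommitTime {
    private final Map<String, Long> totalNanosByCache = new LinkedHashMap<>();

    /** Records the same commit duration for each cache that took part in the transaction. */
    void onCommit(List<String> caches, long commitNanos) {
        for (String cache : caches)
            totalNanosByCache.merge(cache, commitNanos, Long::sum);
    }

    /** Total commit time attributed to one cache. */
    long total(String cache) {
        return totalNanosByCache.getOrDefault(cache, 0L);
    }

    /** Sum over all caches; exceeds the real commit time for cross-cache transactions. */
    long sumOverCaches() {
        long sum = 0;
        for (long v : totalNanosByCache.values())
            sum += v;
        return sum;
    }
}
```

In this model a single commit of 10 ns across two caches shows up as 10 ns on each of them and as 20 ns in total, and neither number says how much of the time either cache was responsible for.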

Again, neither Ignite users nor Ignite developers can make any decision
based on these metrics. But users can create their own set of metrics.

>* histograms that measure the time of processing `get`, `put`, `remove`, `commit`, `rollback`
>    messages on affinity nodes (primary and backups).
>    Ticket doesn't exist for it.

It will be implemented for most types of messages.

Metrics, application monitoring, performance analysis and measurement
are a little harder than they sound. Therefore, we must approach this
issue more carefully.
Blindly adding new types of metrics will not only fail to improve the
situation, but will also worsen the overall performance of the system,
because metric calculation is always on the hot path.

So, from my point of view, commits for get/put/remove and
commit/rollback should be reverted.

On Mon, Dec 16, 2019 at 5:39 PM Nikita Amelchev <nsamelchev@gmail.com> wrote:
>
> I think these metrics are useful.
>
> I have prepared PR [1] for commit and rollback histograms. [2]
> Nikolay, could you take a look, please?
>
> If you do not mind, I will try to add affinity-nodes cache metrics:
> >> * histograms that measure the time of processing `get`, `put`, `remove`, `commit`,
> >> `rollback` messages on affinity nodes (primary and backups). Ticket doesn't exist for it.
>
> I have filed a ticket for it. [3]
>
> [1] https://github.com/apache/ignite/pull/7141
> [2] https://issues.apache.org/jira/browse/IGNITE-12450
> [3] https://issues.apache.org/jira/browse/IGNITE-12453
>
> On Mon, Dec 16, 2019 at 11:07, Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> >
> > I think they are very useful.
> >
> > On Mon, Dec 16, 2019 at 10:51, Николай Ижиков <nizhikov@apache.org> wrote:
> >
> > > Hello, Alexei.
> > >
> > > Thanks for the link to the ticket; I labeled it with the IEP-35 label.
> > > What do you think about proposed metrics set?
> > >
> > > > On Dec 16, 2019, at 10:29, Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> > > >
> > > > Nikolay,
> > > >
> > > > What about batch operations?
> > > >
> > > > For messages processing the ticket does exist and even has an
> > > > implementation from before new metrics API times [1]
> > > >
> > > > [1] https://issues.apache.org/jira/browse/IGNITE-10418
> > > >
> > > > On Mon, Dec 16, 2019 at 10:12, Николай Ижиков <nizhikov@apache.org> wrote:
> > > >
> > > >> Hello, Igniters.
> > > >>
> > > > >> I want to give users an answer to the following question: "How do
> > > > >> cache API operations perform?"
> > > > >> It seems we need to implement metrics for basic cache API operations
> > > > >> like get, put, and remove for that.
> > > >>
> > > >> I think we should provide the following metrics:
> > > >>
> > > > >> * `get`, `put`, `remove` time histograms. Measured for API calls on the
> > > > >> caller node side.
> > > >>    Implemented in [1], commit [2].
> > > >>
> > > > >> * `commit`, `rollback` time histograms. Measured for API calls on the
> > > > >> caller node side [3].
> > > >>
> > > >> * histograms that measure the time of processing `get`, `put`, `remove`,
> > > >> `commit`, `rollback` messages on affinity nodes(primary and backups).
> > > >>    Ticket doesn't exist for it.
> > > >>
> > > >> What do you think?
> > > >>
> > > >> [1] https://issues.apache.org/jira/browse/IGNITE-12219
> > > >> [2]
> > > >>
> > > https://github.com/apache/ignite/commit/e66bbef97b2cef73a533ce8a506ec479852cb364
> > > >> [3] https://issues.apache.org/jira/browse/IGNITE-12450
> > > >>
> > > >
> > > >
> > > > --
> > > >
> > > > Best regards,
> > > > Alexei Scherbakov
> > >
> > >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
>
>
>
> --
> Best wishes,
> Amelchev Nikita
