ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: Additional SQL metrics
Date Fri, 03 Mar 2017 19:01:05 GMT
Hm... as a user I would be interested to know that, say, 95% of my "select
* from sometable where..." query executes under 10ms or so.

I think holding some history is important and is not that hard to implement.

D.

On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <dmagda@apache.org> wrote:

> Sergey, agree, good point!
>
> Igniters, any other thoughts before we wrap up the discussion updating the
> ticket content?
>
> —
> Denis
>
> > On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko <
> valentin.kulichenko@gmail.com> wrote:
> >
> > Sergey, that's great idea! Generally, user is not interested much in some
> > average numbers, especially in case of SQL queries. What they need is a
> > list of slow queries and detailed information about the execution flow of
> > these particular queries.
> >
> > -Val
> >
> > On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <skozlov@gridgain.com>
> wrote:
> >
> >> One more comment:
> >>
> >> In general the customer is interested in slow queries details thus we
> can
> >> introduce an option which will allow to store only queries executed more
> >> than NNN seconds. It may significantly reduce the the memory consumption
> >> for history (but logging of all queries is still available if set that
> >> option to 0).
> >>
> >> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <dmagda@apache.org> wrote:
> >>
> >>> Vovan,
> >>>
> >>> When I’m speaking of JOIN metrics I’m simply assume that we need to
add
> >>> metrics relevant for queries with joins, metrics that will help us get
> >> more
> >>> insights on non-collocated and collocated joins execution flow.
> >>>
> >>>> 1) Query exec count
> >>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>
> >>> Total query execution time might not be helpful in the trickiest cases.
> >>> What if you have multiple joins in your query? How do I know which one
> >>> contributes to the execution most?
> >>>
> >>> So, I would do a breakdown having total time, map time, per-join time,
> >>> reduce time. Hope it’s possible. If it’s unclear how to support
> >> everything
> >>> at the first place then it’s a different question. Let’s create several
> >>> tickets and start implementing everything gracefully.
> >>>
> >>>> 3) Number of bytes exchanged between nodes during query execution
> >>>
> >>> It will be really helpful to make a breakdown showing a number of bytes
> >>> exchanged per-join (physical join). Again, if you believe it makes
> sense
> >> to
> >>> do the breakdown later then let’s create an additional ticket for the
2
> >>> tier metrics then.
> >>>
> >>>> 4) Number of returned rows - min, max, avg
> >>>
> >>>
> >>> Plus, let’s add the following to the list:
> >>>
> >>> 5) Collocated: yes or no
> >>>
> >>> —
> >>> Denis
> >>>
> >>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <vozerov@gridgain.com>
> >>> wrote:
> >>>>
> >>>> Denis,
> >>>>
> >>>> The main problem with suggested metrics is that they implies that
> >> ceratin
> >>>> internal mechanics work in predefined way. For example, what is JOIN
> >>>> metrics? There are no guarantees that JOIN in user's query will be
> >>>> translated to a real physical join. What if several different query
> >>>> execution pieces happen in parallel? What if we rework our distributed
> >>>> query engine from pull to push approach for performance reasons and
> >> there
> >>>> will be no JOINs in classical sense?
> >>>>
> >>>> This is why I think that we should start with very basic things.
> >>> Something
> >>>> like:
> >>>> 1) Query exec count
> >>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>> 3) Number of bytes exchanged between nodes during query execution
> >>>> 4) Number of returned rows - min, max, avg
> >>>>
> >>>> Once we have base numbers in place, we can think of carefully
> >> integrating
> >>>> and enhancing all pieces of query execution into more verbose formats,
> >>>> similar to query plans with relative weights in classical RDBMS
> >> systems.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <dmagda@apache.org>
> >> wrote:
> >>>>
> >>>>> Vovan,
> >>>>>
> >>>>> Your metrics make perfect sense to me. However, I see a high demand
> >> for
> >>>>> JOINs based metrics especially from those who give a try to
> >>> non-collocated
> >>>>> joins in production  and want to measure them somehow. This is why,
> >>>>> personally, I prefer to see the metrics below in the top priority
> list
> >>> as
> >>>>> well:
> >>>>>
> >>>>> if a query was executed in the collocated or non-collocated mode.
> >> Three
> >>>>> results are valid: collocated, non-collocated, simple query (no
> >> joins).
> >>>>> non-collocated query: size of the data exchanged between the nodes
to
> >>>>> complete a specific join. If there are multiple joins in the query
we
> >>> need
> >>>>> to provide this metric for every of them.
> >>>>> non-collocated and collocated query: a part of the time spent joining
> >>> the
> >>>>> data. If there are multiple joins in the query we need to provide
> this
> >>>>> metric for every of them.
> >>>>>
> >>>>> As for “unicast” and “broadcast”, agree, let’s ignore
it for now.
> >>>>>
> >>>>> In any case, can we include timing information (map phase, reduce
> >> phase,
> >>>>> join phase) into an execution plan produced by H2? Are there any
> >>>>> implementation hooks?
> >>>>>
> >>>>> —
> >>>>> Denis
> >>>>>
> >>>>>
> >>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
> >> dsetrakyan@apache.org>
> >>>>> wrote:
> >>>>>>
> >>>>>> I think some of the metrics specified by Denis also make sense,
so I
> >>>>> would
> >>>>>> add them as well. See below...
> >>>>>>
> >>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
> >> vozerov@gridgain.com
> >>>>> <mailto:vozerov@gridgain.com>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Denis,
> >>>>>>>
> >>>>>>> Query execution is complex process involving different stages
which
> >>> are
> >>>>> not
> >>>>>>> very easy to match with each other. Especially provided
that any
> >> node
> >>>>> can
> >>>>>>> leave topology at any time. Another problem is that engine
evolves
> >> and
> >>>>>>> metrics like "did a query do broadcast or unicast" may easily
> become
> >>>>>>> useless at some point, because for example there will be
neither
> >>>>> unicast,
> >>>>>>> nor broadast, but something different. On the other hand
I
> >> completely
> >>>>> agree
> >>>>>>> that performance monitoring is essential part of any mature
DBMS.
> >>>>>>>
> >>>>>>> I would start with metrics which are both very basic and
easy to
> >>>>> implement
> >>>>>>> at the same time. For example we can add fingerprint (hash)
to
> every
> >>>>> query
> >>>>>>> which will be used to join "map" and "reduce" parts with
each other
> >>> and
> >>>>> add
> >>>>>>> the following basic metrics:
> >>>>>>> 1) Execution count for particular query
> >>>>>>> 2) Number of map nodes - min, max, avg
> >>>>>>>
> >>>>>>
> >>>>>> (1) and (2) makes sense
> >>>>>>
> >>>>>>
> >>>>>>> 3) Map step duration (if applicable) - min, max,
> >>>>>>
> >>>>>> 4) Reduce step duration (if applicable) - min, max, avg
> >>>>>>>
> >>>>>>
> >>>>>> Not sure if (3) and (4) are needed. I would only add them if
they
> are
> >>>>> easy
> >>>>>> to implement.
> >>>>>>
> >>>>>> I would also add these:
> >>>>>>
> >>>>>> 5) Collocated: yes/no
> >>>>>> 6) last execution time
> >>>>>> 7) min/max/average execution duration
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Once done users will be able to get statistics for particular
> >> queries.
> >>>>>>>
> >>>>>>> Vladimir.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <dmagda@apache.org>
> >>> wrote:
> >>>>>>>
> >>>>>>>> BTW,
> >>>>>>>>
> >>>>>>>> What if we expose per-query metrics below as a part
of EXPLAIN
> >>> ANALYZE?
> >>>>>>>> Sergi, is this feasible?
> >>>>>>>>
> >>>>>>>> —
> >>>>>>>> Denis
> >>>>>>>>
> >>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <dmagda@apache.org>
> >> wrote:
> >>>>>>>>>
> >>>>>>>>> Igniters,
> >>>>>>>>>
> >>>>>>>>> Let’s shed more light on SQL query execution internals
> >> introducing a
> >>>>>>> set
> >>>>>>>> of useful metrics (https://issues.apache.org/
> >> jira/browse/IGNITE-4757
> >>> ).
> >>>>>>>>>
> >>>>>>>>> Per-query metrics. Total history size is defined
by
> >>>>>>> *CacheConfiguration.
> >>>>>>>> getQueryDetailMetricsSize*:
> >>>>>>>>> * if a query was executed in the collocated or non-collocated
> >> mode.
> >>>>>>>> Three results are valid: collocated, non-collocated,
simple query
> >> (no
> >>>>>>>> joins).
> >>>>>>>>> * non-collocated query: size of the data exchanged
between the
> >> nodes
> >>>>> to
> >>>>>>>> complete a join.
> >>>>>>>>> * non-collocated query: did a query do broadcast
or unicast to
> get
> >>>>> data
> >>>>>>>> needed to complete a join.
> >>>>>>>>> * non-collocated and collocated query: a part of
the time spent
> >>>>> joining
> >>>>>>>> the data.
> >>>>>>>>>
> >>>>>>>>> CacheMetrics:
> >>>>>>>>> * an average number of executed SQL queries (collocated,
> >>>>>>> non-collocated,
> >>>>>>>> simple query (no joins)).
> >>>>>>>>>
> >>>>>>>>> Please don’t hesitate do share suggest another
metrics or improve
> >>>>>>>> proposed ones.
> >>>>>>>>>
> >>>>>>>>> —
> >>>>>>>>> Denis
> >>>>>
> >>>>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Sergey Kozlov
> >> GridGain Systems
> >> www.gridgain.com
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message