ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)
Date Wed, 28 Nov 2018 14:34:10 GMT
Denis

I would wait for running queries view first.

ср, 28 нояб. 2018 г. в 1:57, Denis Magda <dmagda@apache.org>:

> Vladimir,
>
> Please see inline
>
> On Mon, Nov 19, 2018 at 8:23 AM Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
>
> > Denis,
> >
> > I partially agree with you. But there are several problem with syntax
> > proposed by you:
> > 1) This is harder to implement technically - more parsing logic to
> > implement. Ok, this is our internal problem, users do not care about it
> > 2) User will have to consult to docs in any case
> >
>
> Two of these are not a big deal. We just need to invest more time for
> development and during the design phase so that people need to consult the
> docs rarely.
>
>
> > 3) "nodeId" is not really node ID. For Ignite users node ID is UUID. In
> our
> > case this is node order, and we intentionally avoided any naming here.
> >
>
> Let's use a more loose name such as "node".
>
>
> > Query is just identified by a string, no more than that
> > 4) Proposed syntax is more verbose and open ways for misuse. E.g. what is
> > "KILL QUERY WHERE queryId=1234"?
> >
> > I am not 100% satisfied with both variants, but the first one looks
> simpler
> > to me. Remember, that user will not guess query ID. Instead, he will get
> > the list of running queries with some other syntax. What we need to
> > understand for now is how this syntax will look like. I think that we
> > should implement getting list of running queries, and only then start
> > working on cancellation.
> >
>
> That's a good point. Syntax of both running and killing queires commands
> should be tightly coupled. We're going to name a column if running queries
> IDs somehow anyway and that name might be resued in the WHERE clause of
> KILL.
>
> Should we discuss the syntax in a separate thread?
>
> --
> Denis
>
> >
> > Vladimir.
> >
> >
> > On Mon, Nov 19, 2018 at 7:02 PM Denis Mekhanikov <dmekhanikov@gmail.com>
> > wrote:
> >
> > > Guys,
> > >
> > > Syntax like *KILL QUERY '25.1234'* look a bit cryptic to me.
> > > I'm going to look up in documentation, which parameter goes first in
> this
> > > query every time I use it.
> > > I like the syntax, that Igor suggested more.
> > > Will it be better if we make *nodeId* and *queryId *named properties?
> > >
> > > Something like this:
> > > KILL QUERY WHERE nodeId=25 and queryId=1234
> > >
> > > Denis
> > >
> > > пт, 16 нояб. 2018 г. в 14:12, Юрий <jury.gerzhedowich@gmail.com>:
> > >
> > > > I fully agree with last sentences and can start to implement this
> part.
> > > >
> > > > Guys, thanks for your productive participate at discussion.
> > > >
> > > > пт, 16 нояб. 2018 г. в 2:53, Denis Magda <dmagda@apache.org>:
> > > >
> > > > > Vladimir,
> > > > >
> > > > > Thanks, make perfect sense to me.
> > > > >
> > > > >
> > > > > On Thu, Nov 15, 2018 at 12:18 AM Vladimir Ozerov <
> > vozerov@gridgain.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Denis,
> > > > > >
> > > > > > The idea is that QueryDetailMetrics will be exposed through
> > separate
> > > > > > "historical" SQL view in addition to current API. So we are
on
> the
> > > same
> > > > > > page here.
> > > > > >
> > > > > > As far as query ID I do not see any easy way to operate on a
> single
> > > > > integer
> > > > > > value (even 64bit). This is distributed system - we do not want
> to
> > > have
> > > > > > coordination between nodes to get query ID. And coordination
is
> the
> > > > only
> > > > > > possible way to get sexy "long". Instead, I would propose to
form
> > ID
> > > > from
> > > > > > node order and query counter within node. This will be (int,
> long)
> > > > pair.
> > > > > > For use convenience we may convert it to a single string, e.g.
> > > > > > "[node_order].[query_counter]". Then the syntax would be:
> > > > > >
> > > > > > KILL QUERY '25.1234'; // Kill query 1234 on node 25
> > > > > > KILL QUERY '25.*;     // Kill all queries on the node 25
> > > > > >
> > > > > > Makes sense?
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > > > On Wed, Nov 14, 2018 at 1:25 PM Denis Magda <dmagda@apache.org>
> > > wrote:
> > > > > >
> > > > > > > Yury,
> > > > > > >
> > > > > > > As I understand you mean that the view should contains
both
> > running
> > > > and
> > > > > > > > finished queries. If be honest for the view I was
going to
> use
> > > just
> > > > > > > queries
> > > > > > > > running right now. For finished queries I thought
about
> another
> > > > view
> > > > > > with
> > > > > > > > another set of fields which should include I/O related
ones.
> Is
> > > it
> > > > > > works?
> > > > > > >
> > > > > > >
> > > > > > > Got you, so if only running queries are there then your
initial
> > > > > proposal
> > > > > > > makes total sense. Not sure we need a view of the finished
> > queries.
> > > > It
> > > > > > will
> > > > > > > be possible to analyze them through the updated DetailedMetrics
> > > > > approach,
> > > > > > > won't it?
> > > > > > >
> > > > > > > For "KILL QUERY node_id query_id"  node_id required as
part of
> > > unique
> > > > > key
> > > > > > > > of query and help understand Ignite which node start
the
> > > > distributed
> > > > > > > query.
> > > > > > > > Use both parameters will allow cheap generate unique
key
> across
> > > all
> > > > > > > nodes.
> > > > > > > > Node which started a query can cancel it on all nodes
> > participate
> > > > > > nodes.
> > > > > > > > So, to stop any queries initially we need just send
the
> cancel
> > > > > request
> > > > > > to
> > > > > > > > node who started the query. This mechanism is already
in
> > Ignite.
> > > > > > >
> > > > > > >
> > > > > > > Can we locate node_id behind the scenes if the user supplies
> > > query_id
> > > > > > only?
> > > > > > > A query record in the view already contains query_id and
> node_id
> > > and
> > > > it
> > > > > > > sounds like an extra work for the user to fill in all the
> details
> > > for
> > > > > us.
> > > > > > > Embed node_id into query_id if you'd like to avoid extra
> network
> > > hops
> > > > > for
> > > > > > > query_id to node_id mapping.
> > > > > > >
> > > > > > > --
> > > > > > > Denis
> > > > > > >
> > > > > > > On Wed, Nov 14, 2018 at 1:04 AM Юрий <
> > jury.gerzhedowich@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Denis,
> > > > > > > >
> > > > > > > > Under the hood 'time' will be as startTime, but for
system
> > view I
> > > > > > planned
> > > > > > > > use duration which will be simple calculated as now
-
> > startTime.
> > > > So,
> > > > > > > there
> > > > > > > > is't a performance issue.
> > > > > > > > As I understand you mean that the view should contains
both
> > > running
> > > > > and
> > > > > > > > finished queries. If be honest for the view I was
going to
> use
> > > just
> > > > > > > queries
> > > > > > > > running right now. For finished queries I thought
about
> another
> > > > view
> > > > > > with
> > > > > > > > another set of fields which should include I/O related
ones.
> Is
> > > it
> > > > > > works?
> > > > > > > >
> > > > > > > > For "KILL QUERY node_id query_id"  node_id required
as part
> of
> > > > unique
> > > > > > key
> > > > > > > > of query and help understand Ignite which node start
the
> > > > distributed
> > > > > > > query.
> > > > > > > > Use both parameters will allow cheap generate unique
key
> across
> > > all
> > > > > > > nodes.
> > > > > > > > Node which started a query can cancel it on all nodes
> > participate
> > > > > > nodes.
> > > > > > > > So, to stop any queries initially we need just send
the
> cancel
> > > > > request
> > > > > > to
> > > > > > > > node who started the query. This mechanism is already
in
> > Ignite.
> > > > > > > >
> > > > > > > > Native SQL APIs will automatically support the futures
after
> > > > > > implementing
> > > > > > > > for thin clients. So we are good here.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > вт, 13 нояб. 2018 г. в 18:52, Denis Magda
<dmagda@apache.org
> >:
> > > > > > > >
> > > > > > > > > Yury,
> > > > > > > > >
> > > > > > > > > Please consider the following:
> > > > > > > > >
> > > > > > > > >    - If we record the duration instead of startTime,
then
> the
> > > > > former
> > > > > > > has
> > > > > > > > to
> > > > > > > > >    be updated frequently - sounds like a performance
red
> > flag.
> > > > > Should
> > > > > > > we
> > > > > > > > > store
> > > > > > > > >    startTime and endTime instead? This way a
query record
> > will
> > > be
> > > > > > > updated
> > > > > > > > >    twice - when the query is started and terminated.
> > > > > > > > >    - In the IEP you've mentioned I/O related
fields that
> > should
> > > > > help
> > > > > > to
> > > > > > > > >    grasp why a query runs that slow. Should they
be stored
> in
> > > > this
> > > > > > > view?
> > > > > > > > >    - "KILL QUERY query_id" is more than enough.
Let's not
> add
> > > > > > "node_id"
> > > > > > > > >    unless it's absolutely required. Our queries
are
> > distributed
> > > > and
> > > > > > > > > executed
> > > > > > > > >    across several nodes that's why the node_id
parameter is
> > > > > > redundant.
> > > > > > > > >    - This API needs to be supported across all
our
> > interfaces.
> > > We
> > > > > can
> > > > > > > > start
> > > > > > > > >    with JDBC/ODBC and thin clients and then support
for the
> > > > native
> > > > > > SQL
> > > > > > > > APIs
> > > > > > > > >    (Java, Net, C++)
> > > > > > > > >    - Please share examples of SELECTs in the
IEP that would
> > > show
> > > > > how
> > > > > > to
> > > > > > > > >    find long running queries, queries that cause
a lot of
> I/O
> > > > > > troubles.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Denis
> > > > > > > > >
> > > > > > > > > On Tue, Nov 13, 2018 at 1:15 AM Юрий <
> > > > jury.gerzhedowich@gmail.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Igniters,
> > > > > > > > > >
> > > > > > > > > > Some comments for my original email's.
> > > > > > > > > >
> > > > > > > > > > The proposal related to part of IEP-29
> > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > > > > > > > > > >
> > > > > > > > > > .
> > > > > > > > > >
> > > > > > > > > > What purpose are we pursuing of the proposal?
> > > > > > > > > > We want to be able check which queries running
right now
> > > > through
> > > > > > thin
> > > > > > > > > > clients. Get some information related to
the queries and
> be
> > > > able
> > > > > to
> > > > > > > > > cancel
> > > > > > > > > > a query if it required for some reasons.
> > > > > > > > > > So, we need interface to get a running queries.
For the
> > goal
> > > we
> > > > > > > propose
> > > > > > > > > > running_queries system view. The view contains
unique
> query
> > > > > > > identifier
> > > > > > > > > > which need to pass to kill query command
to cancel the
> > query.
> > > > > > > > > >
> > > > > > > > > > What do you think about fields of the running
queries
> view?
> > > May
> > > > > be
> > > > > > > some
> > > > > > > > > > useful fields we could easy add to the view.
> > > > > > > > > >
> > > > > > > > > > Also let's discuss syntax of cancellation
of query. I
> > propose
> > > > to
> > > > > > use
> > > > > > > > > MySQL
> > > > > > > > > > like syntax as easy to understand and shorter
then Oracle
> > and
> > > > > > > Postgres
> > > > > > > > > > syntax ( detailed information in IEP-29
> > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > > > > > > > > > >
> > > > > > > > > > ).
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > пн, 12 нояб. 2018 г. в 19:28, Юрий
<
> > > > jury.gerzhedowich@gmail.com
> > > > > >:
> > > > > > > > > >
> > > > > > > > > > > Igniters,
> > > > > > > > > > >
> > > > > > > > > > > Below is a proposed design for thin
client SQL
> management
> > > and
> > > > > > > > > monitoring
> > > > > > > > > > > to cancel a queries.
> > > > > > > > > > >
> > > > > > > > > > > 1) Ignite expose system SQL view with
name
> > > *running_queries*
> > > > > > > > > > > proposed columns: *node_id, query_id,
sql, schema_name,
> > > > > > > > connection_id,
> > > > > > > > > > > duration*.
> > > > > > > > > > >
> > > > > > > > > > > node_id - initial node of request
> > > > > > > > > > > query_id - unique id of query on node
> > > > > > > > > > > sql - text of query
> > > > > > > > > > > schema name - name of sql schema
> > > > > > > > > > > connection_id - id of client connection
from
> > > > > > > > > > ClientListenerConnectionContext
> > > > > > > > > > > class
> > > > > > > > > > > duration - duration in millisecond
from start of query
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Ignite will gather info about running
queries from each
> > of
> > > > > nodes
> > > > > > > and
> > > > > > > > > > > collect it during user query. We already
have most of
> the
> > > > > > > information
> > > > > > > > > at
> > > > > > > > > > GridRunningQueryInfo
> > > > > > > > > > > on each of nodes.
> > > > > > > > > > >
> > > > > > > > > > > Instead of duration we can use start_time,
but I think
> > > > duration
> > > > > > > will
> > > > > > > > be
> > > > > > > > > > > simple to use due to it not depend
on a timezone.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2) Propose to use following syntax
to kill a running
> > query:
> > > > > > > > > > >
> > > > > > > > > > > *KILL QUERY node_Id query_id*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Both parameters node_id and query_id
can be get through
> > > > > > > > running_queries
> > > > > > > > > > > system view.
> > > > > > > > > > >
> > > > > > > > > > > When a node receive such request it
can be run locally
> in
> > > > case
> > > > > > node
> > > > > > > > > have
> > > > > > > > > > > given node_id or send message to node
with given id.
> > > Because
> > > > > node
> > > > > > > > have
> > > > > > > > > > > information about local running queries
then can cancel
> > it
> > > -
> > > > it
> > > > > > > > already
> > > > > > > > > > > implemented in
> > GridReduceQueryExecutor.cancelQueries(qryId)
> > > > > > method.
> > > > > > > > > > >
> > > > > > > > > > > Comments are welcome.
> > > > > > > > > > > --
> > > > > > > > > > > Живи с улыбкой! :D
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Живи с улыбкой! :D
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Живи с улыбкой! :D
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Живи с улыбкой! :D
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message