ignite-dev mailing list archives

From Alex Plehanov <plehanov.a...@gmail.com>
Subject Re: Ignite diagnostic (SQL system views)
Date Thu, 25 Jan 2018 13:27:45 GMT
Anton, Vladimir, I've made some fixes. There is only one view left and it's
renamed to 'IGNITE.LOCAL_TRANSACTIONS'.

High-level design of the solution:
When IgniteH2Indexing starts, it creates and starts a
new GridH2SysViewProcessor, which creates and registers in H2 (via its own
table engine) all implementations of system views. Each system view
implementation extends the base abstract class GridH2SysView. A view
implementation describes its columns, their types and indexes in the
constructor and must override the getRows method for data retrieval (this
method is called by the H2-compatible table and index implementations for
Ignite system views). Almost no changes to the existing parsing engine were
needed, except in a few places where a GridH2Table instance was expected
but system views use another class.
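
To illustrate the contract, here is a minimal sketch of what a concrete
view implementation could look like. Only GridH2SysView, getRows and the
"GridH2SysViewImpl..." naming come from the PR; the constructor signature,
the column()/createRow() helpers and the exact columns are my assumptions
for illustration, not the actual PR code:

import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.List;

import org.apache.ignite.internal.GridKernalContext;
import org.apache.ignite.internal.processors.cache.transactions.IgniteInternalTx;
import org.h2.engine.Session;
import org.h2.result.Row;
import org.h2.result.SearchRow;
import org.h2.value.Value;

public class GridH2SysViewImplLocalTransactions extends GridH2SysView {
    private final GridKernalContext ctx;

    public GridH2SysViewImplLocalTransactions(GridKernalContext ctx) {
        // Columns, their types and indexed columns are described in the
        // constructor (helper signatures here are hypothetical).
        super("LOCAL_TRANSACTIONS", "Active transactions on the local node",
            column("XID", Value.STRING),
            column("START_TIME", Value.TIMESTAMP),
            column("STATE", Value.STRING));

        this.ctx = ctx;
    }

    /**
     * Called by the H2-compatible table and index implementations.
     * The range bounds (first/last) could narrow the scan; this sketch
     * ignores them and always returns all rows.
     */
    @Override public Iterable<Row> getRows(Session ses, SearchRow first, SearchRow last) {
        List<Row> rows = new ArrayList<>();

        // Walk the transactions known to the local transaction manager.
        for (IgniteInternalTx tx : ctx.cache().context().tm().activeTransactions())
            rows.add(createRow(ses,
                tx.xid().toString(),
                new Timestamp(tx.startTime()),
                tx.state().toString()));

        return rows;
    }
}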

New PR: [1].  Please have a look.

[1] https://github.com/apache/ignite/pull/3433

2018-01-24 19:12 GMT+03:00 Anton Vinogradov <avinogradov@gridgain.com>:

> I've created IEP-13 [1] to cover all cases.
> Feel free to create issues.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769
>
> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
>
> > Let's start with a single and most simple view, e.g.
> > LOCAL_TRANSACTIONS. We will review and merge it along with the necessary
> > infrastructure. Then we will handle the rest of the views in separate
> > tickets and separate focused discussions.
> >
> > On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <plehanov.alex@gmail.com>
> > wrote:
> >
> > > 1) It’s not a matter of principle, I can change the schema. The
> > > INFORMATION_SCHEMA was used because it already exists and is usually
> > > used for metadata tables and views. Your proposal is to use the schema
> > > “IGNITE”, do I understand you right? BTW, for now we can’t query other
> > > (H2) meta tables from the INFORMATION_SCHEMA, so “Ignite system views”
> > > are the only views available to query from this schema.
> > > 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to
> > > determine which node we are connected to.
> > > 3) As the first phase, in my opinion, local views will be enough.
> > > Performance and caching of distributed views should be discussed in
> > > later phases, when the distributed views implementation is planned. In
> > > the current implementation I tried to use indexing for local views
> > > wherever possible.
> > > 4) I don’t think that JVM info is more critical than, for example,
> > > caches or nodes information. When are authorization capabilities
> > > planned to be implemented?
> > >
> > > About local data: yes, we can rename all currently implemented views
> > > for the local node data as LOCAL_..., and (someday) create new
> > > whole-cluster views (which use distributed requests) without a prefix
> > > or, for example, with a CLUSTER_ prefix. But some views can show
> > > whole-cluster information using only local node data, without
> > > distributed requests (for example IGNITE_NODE_METRICS,
> > > IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION, IGNITE_NODES, etc.).
> > > Are they local or cluster views in this concept? Which prefix should
> > > be used? And what about caches? Are they local or cluster? On a local
> > > node we can see cluster-wide caches (replicated and partitioned) and
> > > caches for the current node only. The local caches list may differ
> > > from node to node. Which prefix should be used for this view? And one
> > > more thing: for some views there is no sense in making them
> > > cluster-wide (for example IGNITE_INSTANCE). Should we name it
> > > LOCAL_INSTANCE without creating an INSTANCE view?
> > >
> > > So, next steps: split the PR, change the schema name (IGNITE?), and
> > > change the view name for caches (CACHES, LOCAL_CACHES?).
> > >
> > >
> > > 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <vozerov@gridgain.com>:
> > >
> > > > Hi Alex,
> > > >
> > > > System views could be an extremely valuable addition to Ignite.
> > > > Ideally, a user should be able to monitor and manage the state of the
> > > > whole cluster from a single SQL command line. We have had plans to
> > > > implement this for a very long time. However, this is a very
> > > > sensitive task which should take a lot of moving pieces into account,
> > > > such as usability, consistency, performance, security, etc.
> > > >
> > > > Let me point out several major concerns I see at the moment:
> > > >
> > > > 1) Usability: INFORMATION_SCHEMA
> > > > This schema is part of the ANSI SQL standard. When creating system
> > > > views, some vendors prefer to store them in a completely different
> > > > predefined schema (Oracle, MS SQL). Others prefer to keep them in
> > > > INFORMATION_SCHEMA directly. Both approaches could work. However, the
> > > > latter breaks separation of concerns - we store typical metadata next
> > > > to possibly sensitive system data. It also makes security management
> > > > more complex - system data is very sensitive, and now we cannot
> > > > simply grant a user access to INFORMATION_SCHEMA. Instead, we have to
> > > > grant that access on a per-view basis. For this reason my preference
> > > > is to store system tables in a separate schema, not in
> > > > INFORMATION_SCHEMA.
> > > >
> > > > 2) Consistency: local data
> > > > One of the implemented views is GridH2SysViewImplInstance. Normally
> > > > SQL users communicate with Ignite through JDBC/ODBC drivers. These
> > > > drivers are connected to a single node, typically a client node.
> > > > Moreover, we will introduce a high-availability feature where drivers
> > > > are able to connect to any address from a predefined list. This
> > > > renders the view useless, as you do not know which node you are
> > > > connected to. Also, local-only data cannot be joined in the general
> > > > case - you will receive different results on different nodes. The
> > > > same goes for transactions, JVM info, etc.
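> > > >
> > > > (For illustration - a sketch of that scenario assuming the thin JDBC
> > > > driver; the multi-address URL stands for the planned fail-over
> > > > feature, and the view name is the one from the current prototype:)
> > > >
> > > > import java.sql.Connection;
> > > > import java.sql.DriverManager;
> > > > import java.sql.ResultSet;
> > > > import java.sql.Statement;
> > > >
> > > > public class WhichNodeAmIOn {
> > > >     public static void main(String[] args) throws Exception {
> > > >         // The driver picks one address from the list - the
> > > >         // application cannot tell in advance which node it lands on.
> > > >         try (Connection conn = DriverManager.getConnection(
> > > >                  "jdbc:ignite:thin://10.0.0.1:10800,10.0.0.2:10800");
> > > >              Statement stmt = conn.createStatement();
> > > >              ResultSet rs = stmt.executeQuery(
> > > >                  "SELECT * FROM INFORMATION_SCHEMA.IGNITE_INSTANCE")) {
> > > >             // A node-local view only describes whichever node the
> > > >             // driver happened to connect to.
> > > >             while (rs.next())
> > > >                 System.out.println(rs.getString(1));
> > > >         }
> > > >     }
> > > > }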
> > > >
> > > > 3) Performance
> > > > Suppose we fixed the consistency of transactions and this view now
> > > > shows transactions in the whole cluster with the possibility to
> > > > filter them by node - this is what a user would expect out of the
> > > > box. Then another problem appears - performance. How would we collect
> > > > the necessary data? How would we handle joins, when a particular view
> > > > could be scanned multiple times during query execution? How do we
> > > > achieve sensible consistency? Most probably we would collect remote
> > > > data once when the query is started, cache it somehow at the query
> > > > session level, and then re-use it during joins. But again, this
> > > > should be discussed separately.
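> > > >
> > > > (A purely illustrative sketch of that session-level caching idea -
> > > > every name here is hypothetical, nothing is from the PR:)
> > > >
> > > > import java.util.HashMap;
> > > > import java.util.List;
> > > > import java.util.Map;
> > > > import java.util.function.Supplier;
> > > >
> > > > class QuerySessionViewCache {
> > > >     /** View name -> rows already collected for this query session. */
> > > >     private final Map<String, List<Object[]>> cached = new HashMap<>();
> > > >
> > > >     /**
> > > >      * Collect distributed view data once when the query starts and
> > > >      * re-use it when the same view is scanned again during joins.
> > > >      */
> > > >     List<Object[]> rows(String view, Supplier<List<Object[]>> fetch) {
> > > >         return cached.computeIfAbsent(view, k -> fetch.get());
> > > >     }
> > > > }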
> > > >
> > > > 4) Security: JVM info
> > > > We should define clear boundaries of what info is exposed. JVM data,
> > > > along with running threads, is critically sensitive information. We
> > > > should not expose it until we have authorization capabilities.
> > > >
> > > > In order to start moving this code from prototype to production
> > > > state we should start with the most simple and consistent views, e.g.
> > > > IGNITE_CACHES. Let's move it to a separate PR, review the
> > > > infrastructure code, review the view implementation, agree on proper
> > > > naming and placement, and merge it. Then each and every view (or
> > > > group of related views) should be discussed and reviewed separately.
> > > >
> > > > As far as node-local stuff goes, maybe we should move it to a
> > > > separate schema, or mark it with a special prefix. E.g.
> > > > "IGNITE.TRANSACTIONS" - all transactions in the cluster,
> > > > "IGNITE.LOCAL_TRANSACTIONS" - transactions on the local node. In this
> > > > case we will be able to merge the "local" stuff shortly, and
> > > > implement the more complex but at the same time much more useful
> > > > distributed stuff later on.
> > > >
> > > > Makes sense?
> > > >
> > > > Vladimir.
> > > >
> > > >
> > > > On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <plehanov.alex@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello, Igniters!
> > > > >
> > > > > For Ignite diagnostics it’s usually helpful to get some information
> > > > > about Ignite internals. But currently, in my opinion, there are no
> > > > > convenient tools for this purpose:
> > > > >
> > > > > ·        Some issues can be solved by analyzing log files. Log
> > > > > files are useful for dumps, but sometimes they are difficult to
> > > > > read. Also, interesting metrics can’t be obtained at runtime on
> > > > > request; we have to wait until Ignite writes these metrics on a
> > > > > timeout or other event.
> > > > >
> > > > > ·        JMX is useful for scalar metrics. Complex and table data
> > > > > can also be obtained, but it’s difficult to read, filter and sort
> > > > > it without processing by specialized external tools. For the most
> > > > > frequently used cases, almost-duplicate metrics are created just to
> > > > > show the data in an easy-to-read form.
> > > > >
> > > > > ·        Web-console is able to show table and complex data.
> > > > > Perhaps someday web-console will contain all necessary dashboards
> > > > > for most problem investigations, but some non-trivial queries will
> > > > > not be covered anyway. Also, web-console needs additional
> > > > > infrastructure to work.
> > > > >
> > > > > ·        External “home-made” tools can be used for non-trivial
> > > > > cases. They cover highly specialized cases and usually can’t be
> > > > > used as general-purpose tools.
> > > > >
> > > > > Sometimes we are forced to use more than one tool and join the
> > > > > data by hand (for example, a current thread dump and data from
> > > > > logs).
> > > > >
> > > > > RDBMSs often provide system views for diagnostic purposes (for
> > > > > example, DBA_% and V$% in Oracle), which can be queried by SQL.
> > > > > This solution makes all internal diagnostic information available
> > > > > in a readable form (with all possible filters and projections)
> > > > > without using any other internal or external tools. My proposal is
> > > > > to create similar system views in Ignite.
> > > > >
> > > > > I implemented a working prototype (PR: [1]). It contains the
> > > > > following views:
> > > > >
> > > > > IGNITE_SYSTEM_VIEWS            Registered system views
> > > > > IGNITE_INSTANCE                Ignite instance
> > > > > IGNITE_JVM_THREADS             JVM threads
> > > > > IGNITE_JVM_RUNTIME             JVM runtime
> > > > > IGNITE_JVM_OS                  JVM operating system
> > > > > IGNITE_CACHES                  Ignite caches
> > > > > IGNITE_CACHE_CLUSTER_METRICS   Ignite cache cluster metrics
> > > > > IGNITE_CACHE_NODE_METRICS      Ignite cache node metrics
> > > > > IGNITE_CACHE_GROUPS            Cache groups
> > > > > IGNITE_NODES                   Nodes in topology
> > > > > IGNITE_NODE_HOSTS              Node hosts
> > > > > IGNITE_NODE_ADDRESSES          Node addresses
> > > > > IGNITE_NODE_ATTRIBUTES         Node attributes
> > > > > IGNITE_NODE_METRICS            Node metrics
> > > > > IGNITE_TRANSACTIONS            Active transactions
> > > > > IGNITE_TRANSACTION_ENTRIES     Cache entries used by transaction
> > > > > IGNITE_TASKS                   Active tasks
> > > > > IGNITE_PART_ASSIGNMENT         Partition assignment map
> > > > > IGNITE_PART_ALLOCATION         Partition allocation map
> > > > >
> > > > > There are many more useful views that could be implemented
> > > > > (executors diagnostics, SPI diagnostics, etc.).
> > > > >
> > > > > Some usage examples:
> > > > >
> > > > > Cache groups and their partitions which have been used by a
> > > > > transaction for more than 5 minutes:
> > > > >
> > > > > SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, count(*) AS ENTITIES_CNT
> > > > > FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
> > > > > JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
> > > > > JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
> > > > > JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
> > > > > WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
> > > > > GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
> > > > >
> > > > >
> > > > >
> > > > > Average CPU load on server nodes grouped by operating system:
> > > > >
> > > > > SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
> > > > > FROM INFORMATION_SCHEMA.IGNITE_NODES n
> > > > > JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID
> > > > >   AND na.NAME = 'os.name'
> > > > > JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
> > > > > WHERE n.IS_CLIENT = false
> > > > > GROUP BY na.VALUE
> > > > >
> > > > >
> > > > >
> > > > > Top 5 nodes by puts to cache ‘cache’:
> > > > >
> > > > > SELECT cm.NODE_ID, cm.CACHE_PUTS
> > > > > FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
> > > > > WHERE cm.CACHE_NAME = 'cache'
> > > > > ORDER BY cm.CACHE_PUTS DESC
> > > > > LIMIT 5
> > > > >
> > > > >
> > > > >
> > > > > Is this implementation interesting to anyone else? Maybe some
> > > > > views are redundant? Which additional first-priority views must be
> > > > > implemented? Any other thoughts or proposals?
> > > > >
> > > > > [1] https://github.com/apache/ignite/pull/3413
> > > > >
> > > >
> > >
> >
>
