ignite-dev mailing list archives

From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: Ignite not friendly for Monitoring
Date Tue, 16 Jan 2018 20:30:37 GMT
Assigned version 2.5 to the ticket. Let's try to make progress on this
before then.

On Tue, Jan 16, 2018 at 12:03 PM, Denis Magda <dmagda@apache.org> wrote:

> Serge,
>
> Thanks for taking this over. I think we’re moving in the right direction with
> your proposal:
>
> * I would add a top-level domain for “Integrations”. All the integrations
> with Kafka, Spark, Storm, etc. should go there.
>
> * The number of second-level domains per top-level domain can grow over
> time. Let’s reserve a decent range for this possible growth.
>
> * I guess external adapters should go under “Integrations”, which sounds
> better to me.
>
> * Agree that this ticket should be used to track the progress in JIRA:
> https://issues.apache.org/jira/browse/IGNITE-3690
>
>
> On top of this, the effort has to be tested with a third-party tool such
> as Dynatrace or Nagios. If such tools can pick up and analyze our logs to
> automate classic DevOps tasks, the goal will be achieved. Can you
> include this as a required task for QA?
>
> —
> Denis
>
> > On Jan 15, 2018, at 7:48 AM, Serge Puchnin <sergey.puchnin@gmail.com> wrote:
> >
> > Igniters,
> >
> > It's the right idea!
> >
> > Let's try to revitalize it and move forward.
> >
> > As a first step, I would like to propose a list of top-level domains.
> >
> > -- the phase 1
> >    1. UnExpected, UnKnown
> >    2. Cluster and Topology
> >        Discovery
> >        Segmentation
> >        Node Startup
> >        Communication
> >        Queue
> >        Activate, startup process
> >        Base line topology
> >        Marshaller
> >        Metadata
> >        Topology Validate
> >    3. Cache and Storage
> >        Partition map exchange
> >        Balancing
> >        Long-running transactions
> >        Checkpoint
> >        Create cache
> >        Destroy cache
> >        Data loading & streaming
> >    4. SQL
> >        Long-running queries
> >        Parsing
> >        Queries
> >        Scan Queries
> >        SqlLine
> >    5. Compute
> >        Deployment
> >        spi.checkpoint
> >        spi.collision
> >        Job Schedule
> >
> > -- the phase 2
> >    6. Service
> >    7. Security
> >    8. ML
> >    9. External Adapters
> >    10. WebConsole
> >    11. Vendor Specific
> >        GG
> >
> >
> > For every second-level domain, the plan is to reserve one hundred error
> > codes. The sum over a top-level domain's second-level domains (rounded up
> > to the next thousand) gives the number of codes reserved for that
> > top-level domain.
> >
> > Every error code has a severity level:
> >
> > Critical (Red) - the system is not operational;
> > Warning (Yellow) - the system is operational but health is degraded;
> > Info - informational only.
> >
> > Each code also gets a two- or three-letter prefix. It makes an issue easier
> > to find, without complex grep rules (something like grep
> > "10[2][5-9][0-5][0-9]|10[3][0-5][0-6][0-9]" * to find codes between
> > 102500 and 103569).
> >
> >
> > Domains from the first phase look fine, but those from the second are
> > vague. Initially, we can focus only on the first phase.
> >
> > Please share your thoughts on proposed design.
> >
> > Serge.
> >
> >
> >
> > --
> > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>
>
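
The range arithmetic in Serge's proposal (one hundred codes per second-level domain, with each top-level block rounded up to the next thousand) can be sketched as below. This is only an illustration, not anything agreed on the list: the starting offset of 0, the consecutive layout of blocks, the treatment of a domain with no second-level entries, and the names `block_size` and `allocate` are all assumptions. "Rounded up" is read here as a plain ceiling, so a domain with exactly ten second-level entries gets a full block with no spare room.

```python
import math

CODES_PER_SECOND_LEVEL = 100  # from the proposal
ROUND_TO = 1000               # from the proposal: round up to the next thousand

# Phase 1 domains exactly as listed in the proposal.
PHASE_1 = {
    "UnExpected, UnKnown": [],
    "Cluster and Topology": [
        "Discovery", "Segmentation", "Node Startup", "Communication", "Queue",
        "Activate, startup process", "Base line topology", "Marshaller",
        "Metadata", "Topology Validate",
    ],
    "Cache and Storage": [
        "Partition map exchange", "Balancing", "Long-running transactions",
        "Checkpoint", "Create cache", "Destroy cache",
        "Data loading & streaming",
    ],
    "SQL": ["Long-running queries", "Parsing", "Queries", "Scan Queries", "SqlLine"],
    "Compute": ["Deployment", "spi.checkpoint", "spi.collision", "Job Schedule"],
}


def block_size(n_second_level):
    """Number of codes reserved for one top-level domain.

    Assumption: a domain without second-level entries still gets one block.
    """
    raw = max(n_second_level, 1) * CODES_PER_SECOND_LEVEL
    return math.ceil(raw / ROUND_TO) * ROUND_TO


def allocate(domains, start=0):
    """Lay out top-level blocks back to back (hypothetical layout)."""
    layout = {}
    base = start
    for top, seconds in domains.items():
        size = block_size(len(seconds))
        second_level = {
            name: (base + i * CODES_PER_SECOND_LEVEL,
                   base + (i + 1) * CODES_PER_SECOND_LEVEL - 1)
            for i, name in enumerate(seconds)
        }
        layout[top] = {"range": (base, base + size - 1),
                       "second_level": second_level}
        base += size
    return layout


if __name__ == "__main__":
    for top, info in allocate(PHASE_1).items():
        print(top, info["range"])
```

Under this reading, "Cluster and Topology" already fills its whole block, which is one way to see why Denis asks for a larger reserved range per top-level domain.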
