[ https://issues.apache.org/jira/browse/IGNITE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Kosarev updated IGNITE-9135:
-----------------------------------
Description:
On a large topology (about 200 servers / 50 clients) we often see, via JMX (TcpDiscoverySpiMBean),
high MessageWorkerQueueSize peaks (>100) even though the cluster topology is stable. The counts of
ProcessedMessages and ReceivedMessages for TcpDiscoveryStatusCheckMessage are also very high
(about 250,000), whereas for TcpDiscoveryMetricsUpdateMessage they are about 110,000.
Note that the [org.apache.ignite.spi.discovery.tcp.ServerImpl.RingMessageWorker#metricsCheckFreq|https://github.com/apache/ignite/blob/5faffcee7cfaae90d3093e624d27f1b69554ea10/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L2560]
value does not depend on the topology size:
private long metricsCheckFreq = 3 * spi.metricsUpdateFreq + 50;
(introduced in IGNITE-4799)
Why do we have such peaks (see the attached screenshots) on a stable topology?
Consider changing the metricsCheckFreq formula to depend on the topology size.
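For illustration, here is a minimal sketch of what a topology-aware formula could look like. This is
an assumption, not the actual ServerImpl code: the per-hop allowance constant and the helper names
(topologyAwareFreq, ASSUMED_PER_HOP_MS) are invented for the example. The idea is to grow the check
interval linearly with the number of server nodes, so one status-check message can traverse the whole
ring before the next one is generated.
{code:java}
// Hypothetical sketch only -- not the actual Ignite ServerImpl code.
public class MetricsCheckFreqSketch {
    /** Assumed per-hop delivery latency on the ring, in ms (illustrative, not from Ignite). */
    private static final long ASSUMED_PER_HOP_MS = 5;

    /** Current formula: a fixed multiple of metricsUpdateFreq, independent of cluster size. */
    static long currentFreq(long metricsUpdateFreq) {
        return 3 * metricsUpdateFreq + 50;
    }

    /**
     * Possible topology-aware variant: add a per-server allowance so a previously issued
     * TcpDiscoveryStatusCheckMessage has time to traverse the whole ring.
     */
    static long topologyAwareFreq(long metricsUpdateFreq, int serverCount) {
        return 3 * metricsUpdateFreq + 50 + (long) serverCount * ASSUMED_PER_HOP_MS;
    }

    public static void main(String[] args) {
        long metricsUpdateFreq = 2_000; // TcpDiscoverySpi default metricsUpdateFreq is 2000 ms.

        // Prints 6050 -- the same value regardless of cluster size.
        System.out.println("current:                      " + currentFreq(metricsUpdateFreq));

        // Prints 7050 for the ~200-server topology described in this report.
        System.out.println("topology-aware (200 servers): " + topologyAwareFreq(metricsUpdateFreq, 200));
    }
}
{code}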
> TcpDiscovery - High Workload in Stable topology (MessageWorkerQueueSize peaks)
> ------------------------------------------------------------------------------
>
> Key: IGNITE-9135
> URL: https://issues.apache.org/jira/browse/IGNITE-9135
> Project: Ignite
> Issue Type: Bug
> Reporter: Sergey Kosarev
> Priority: Major
> Attachments: IMG_20180731_014146_HDR.jpg, IMG_20180731_015439_HDR.jpg
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)