ignite-dev mailing list archives

From "Ksenia Rybakova (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-5707) Client can't resume streaming even after topology got stable during load test
Date Thu, 06 Jul 2017 11:43:00 GMT
Ksenia Rybakova created IGNITE-5707:
---------------------------------------

             Summary: Client can't resume streaming even after topology got stable during load test
                 Key: IGNITE-5707
                 URL: https://issues.apache.org/jira/browse/IGNITE-5707
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.1
            Reporter: Ksenia Rybakova


Load test config:
- CacheRandomOperationBenchmark
- 8 clients, 48 servers on 8 hosts
- 26 physical caches of different types with different memory policies + 30 groups with 10 partitioned caches each + 20 groups with 10 replicated caches each. 526 caches in total.
- Preloading amount: 50K, key range: 60K
Complete configs are attached.
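
For reference, the groups above use the cache groups feature introduced in 2.1 (caches sharing a group via CacheConfiguration.setGroupName). A rough Spring XML sketch of one such group of partitioned caches — the names here are illustrative, the actual attached configs may differ:

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="cacheConfiguration">
        <list>
            <!-- Two caches sharing one cache group (illustrative names). -->
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="part-cache-1"/>
                <property name="groupName" value="group-1"/>
                <property name="cacheMode" value="PARTITIONED"/>
            </bean>
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="part-cache-2"/>
                <property name="groupName" value="group-1"/>
                <property name="cacheMode" value="PARTITIONED"/>
            </bean>
        </list>
    </property>
</bean>
```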

3 of the 8 clients log the following messages during preloading:
{noformat}
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false]]
class org.apache.ignite.IgniteCheckedException: DataStreamer request failed [node=16a20d0c-4009-4bfa-ad6e-0261d9e3b2a3]
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.onResponse(DataStreamerImpl.java:1785)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.onResponse(DataStreamerImpl.java:1785)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$3.onMessage(DataStreamerImpl.java:333)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteCheckedException: DataStreamer will retry data transfer at stable topology [reqTop=AffinityTopologyVersion [topVer=56, minorTopVer=0], topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], node=remote]
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.localUpdate(DataStreamProcessor.java:343)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:301)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:58)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:88)
        ... 7 more
{noformat}
2 drivers were able to resume streaming after some time, but 1 was not (the error messages kept being printed). That driver had high heap utilization, which resulted in long GC pauses; eventually it was considered failed by the other nodes.
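
The "DataStreamer will retry data transfer at stable topology" exception above means the request hit a topology version newer than the one it was sent for, and the client is expected to back off and resend the batch once the topology settles. A minimal sketch of such a bounded retry loop — plain Java with a hypothetical `Batch` stand-in for the streaming call, not the actual benchmark driver code:

```java
/**
 * Sketch: retry a failing batch send with exponential backoff, giving the
 * cluster time to reach a stable topology before each retry. The Batch
 * interface is a stand-in for whatever call pushes data to the streamer.
 */
public class StreamRetry {
    @FunctionalInterface
    interface Batch {
        void send() throws Exception;
    }

    /** Returns true if the batch was accepted within maxAttempts tries. */
    static boolean sendWithRetry(Batch batch, int maxAttempts, long initialBackoffMs)
            throws InterruptedException {
        long backoff = initialBackoffMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                batch.send();
                return true;                          // batch accepted
            } catch (Exception e) {
                if (attempt == maxAttempts)
                    return false;                     // give up
                Thread.sleep(backoff);                // wait for topology to settle
                backoff = Math.min(backoff * 2, 5_000); // capped exponential backoff
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Fails twice (simulating an unstable topology), then succeeds.
        boolean ok = sendWithRetry(() -> {
            if (++calls[0] < 3)
                throw new IllegalStateException("unstable topology");
        }, 5, 1);
        System.out.println(ok + " after " + calls[0] + " attempts");
    }
}
```

In this incident the third driver never got that far: its GC pauses outlived the failure-detection timeout, so no amount of client-side retrying would have helped once the other nodes dropped it.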

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
