ignite-user mailing list archives

From Ilya Kasnacheev <ilya.kasnach...@gmail.com>
Subject Re: Performance degradation in case of high volumes
Date Fri, 22 Feb 2019 15:35:44 GMT
Hello!

It seems that, as time goes on, disk I/O can't keep up with your write rate.

The recommendation here is probably to increase the checkpoint frequency
value (checkpointFrequency, measured in ms), so that checkpoints run less
often. For example, set it to 600000 (10 minutes).

The downside is that after a crash, the node will take longer to come back
online.
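
A minimal sketch of that setting, written in the same Spring XML style as the
configuration quoted below. checkpointFrequency is the
DataStorageConfiguration property that holds the interval between checkpoints
in milliseconds; its default is 180000 (3 minutes), which is consistent with
the roughly 3-minute spacing of the first "Checkpoint finished" entries in the
quoted log. The placeholder comment for the remaining properties is an
assumption, so keep whatever the node already has configured there.

```xml
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <!-- Interval between checkpoints in ms: 600000 = 10 minutes. -->
        <property name="checkpointFrequency" value="600000" />

        <!-- Assumption: the existing properties (writeThrottlingEnabled,
             defaultDataRegionConfiguration, walMode, walPath,
             walArchivePath) stay unchanged. -->
    </bean>
</property>
```

Whether 10 minutes is the right value depends on the disks; it is worth
comparing the fsync times in the "Checkpoint finished" lines before and after
the change.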

Regards,
-- 
Ilya Kasnacheev


On Fri, 22 Feb 2019 at 17:37, Antonio Conforti <antonio.conforti@sia.eu> wrote:

> Hi support,
>
> I'm running a performance test that writes 4000 entries per second to a cache
> that is:
> 1.      TRANSACTIONAL
> 2.      partitioned
> 3.      configured with 1 backup (and affinity with exclude neighbors enabled)
> 4.      using write synchronization mode FULL_ASYNC
> 5.      indexed on key and value (and enabled for SQL queries)
>
> Writes are performed by a client node using a data streamer with a
> StreamVisitor and autoFlushFrequency set to 1 second.
>
> We have configured:
> 1.      failureDetectionTimeout set to 120000 ms
> 2.      Data region (only 1):
>         a.      Persistence enabled
>         b.      Max size 8 GB
>         c.      checkpointPageBufferSize 2 GB
> 3.      WAL mode LOG_ONLY
> 4.      WAL archiving disabled (WAL path and WAL archive path set to the
>         same value)
> 5.      Pages write throttling enabled
>
>
> After some hours, having submitted about 20 million entries without
> problems, the client node starts to show delays: the queue from which the
> Ignite client node reads messages starts to grow.
>
> Checking the server and client node logs, there are no error messages,
> but the WAL statistics show high fsync values.
>
> Could you help me understand why, despite a constant write rate and a
> constant CPU usage of about 30%, the server seems to slow down only after a
> certain number of entries?
>
> Maybe there is some parameter to tune that I missed?
>
> Below is the configuration used for the simulation:
>
> 8 server nodes in total, distributed as follows:
> HOST1 with 4 server nodes and 1 client node, on HDD
> HOST2 with 4 server nodes, on HDD
>
>
> Both hosts are machines with 16 cores, 256 GB of memory and HDD disks.
>
> The DataStorageConfiguration for each server node is as follows:
>
>
> <property name="dataStorageConfiguration">
>     <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>         <property name="writeThrottlingEnabled" value="true" />
>         <property name="defaultDataRegionConfiguration">
>             <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>                 <property name="persistenceEnabled" value="true" />
>                 <property name="maxSize" value="#{8L * 1024 * 1024 * 1024}" />
>                 <property name="checkpointPageBufferSize" value="#{2048L * 1024 * 1024}" />
>             </bean>
>         </property>
>
>         <property name="walMode" value="LOG_ONLY" />
>         <property name="walPath" value="wal/path" />
>         <property name="walArchivePath" value="wal/path" />
>     </bean>
> </property>
>
>
> JVM options used to start each server node:
>
> -server -Xms4g -Xmx8g -XX:+AlwaysPreTouch -XX:+UseG1GC
> -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC
>
>
>
> Here are the WAL statistics from the log of node 1:
>
> At the start of the simulation:
> 2019-02-22 10:19:44.195  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=c115ead9-643d-45e5-be41-cd7ae5caac14, pages=11891,
> markPos=FileWALPointer [idx=1, fileOff=36517886, len=79426],
> walSegmentsCleared=0, walSegmentsCovered=[0], markDuration=34ms,
> pagesWrite=87ms, fsync=1931ms, total=2052ms]
> 2019-02-22 10:22:44.742  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=d40c9096-2e10-46ca-ae8e-2a39e242b768, pages=66732,
> markPos=FileWALPointer [idx=7, fileOff=63806638, len=79426],
> walSegmentsCleared=7, walSegmentsCovered=[1 - 6], markDuration=98ms,
> pagesWrite=407ms, fsync=2085ms, total=2590ms]
> 2019-02-22 10:25:44.900  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=5124f12f-6ed8-4ed3-9644-3c58957600ed, pages=70253,
> markPos=FileWALPointer [idx=14, fileOff=47159207, len=79426],
> walSegmentsCleared=6, walSegmentsCovered=[7 - 13], markDuration=98ms,
> pagesWrite=402ms, fsync=2241ms, total=2741ms]
> 2019-02-22 10:28:47.866  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=9094d36c-cd79-4d30-9b90-e48de78fa3e6, pages=72524,
> markPos=FileWALPointer [idx=21, fileOff=39728290, len=79426],
> walSegmentsCleared=8, walSegmentsCovered=[14 - 20], markDuration=83ms,
> pagesWrite=365ms, fsync=5255ms, total=5703ms]
> 2019-02-22 10:31:53.635  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=7132d53a-e2a6-4ac8-b1a8-2621cc39c82b, pages=77471,
> markPos=FileWALPointer [idx=28, fileOff=64681287, len=79426],
> walSegmentsCleared=7, walSegmentsCovered=[21 - 27], markDuration=494ms,
> pagesWrite=748ms, fsync=10136ms, total=11472ms]
>
>
> At the end of the simulation:
>
> 2019-02-22 11:52:36.339  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=dc8369b9-b100-4dd0-bfaf-9a5b13620072, pages=129942,
> markPos=FileWALPointer [idx=309, fileOff=19048810, len=79426],
> walSegmentsCleared=11, walSegmentsCovered=[298 - 308], markDuration=77ms,
> pagesWrite=797ms, fsync=171049ms, total=171923ms]
> 2019-02-22 11:56:24.001  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=1fb1ced8-a257-4c6f-be97-a1867ba6692e, pages=133096,
> markPos=FileWALPointer [idx=320, fileOff=13420410, len=79426],
> walSegmentsCleared=11, walSegmentsCovered=[309 - 319], markDuration=1707ms,
> pagesWrite=1537ms, fsync=216332ms, total=219576ms]
> 2019-02-22 12:00:23.052  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=0d48f6d5-7601-4839-b422-55b228978da5, pages=150587,
> markPos=FileWALPointer [idx=332, fileOff=47800048, len=79426],
> walSegmentsCleared=12, walSegmentsCovered=[320 - 331], markDuration=2275ms,
> pagesWrite=752ms, fsync=236023ms, total=239051ms]
> 2019-02-22 12:04:05.562  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=b23526c4-5121-48a9-9902-522a5ffb3a28, pages=155805,
> markPos=FileWALPointer [idx=345, fileOff=40020477, len=79426],
> walSegmentsCleared=13, walSegmentsCovered=[332 - 344], markDuration=525ms,
> pagesWrite=1324ms, fsync=220654ms, total=222504ms]
> 2019-02-22 12:07:54.005  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=1bb2a0c0-7a89-47f2-af9f-90e90d44c14b, pages=149055,
> markPos=FileWALPointer [idx=357, fileOff=51666923, len=79426],
> walSegmentsCleared=12, walSegmentsCovered=[345 - 356], markDuration=995ms,
> pagesWrite=1559ms, fsync=225888ms, total=228442ms]
> 2019-02-22 12:11:49.962  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=19f14a04-842f-409e-9c6b-afb59193e419, pages=153022,
> markPos=FileWALPointer [idx=370, fileOff=16234647, len=79426],
> walSegmentsCleared=13, walSegmentsCovered=[357 - 369], markDuration=1773ms,
> pagesWrite=1044ms, fsync=233139ms, total=235957ms]
> 2019-02-22 12:15:59.332  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=c1316fb7-1ecc-4358-bf90-a87772969c03, pages=159668,
> markPos=FileWALPointer [idx=383, fileOff=21979375, len=79426],
> walSegmentsCleared=13, walSegmentsCovered=[370 - 382], markDuration=1249ms,
> pagesWrite=1693ms, fsync=246428ms, total=249370ms]
> 2019-02-22 12:20:05.814  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=a9e394f2-0011-4f89-8b5a-a6ed77774103, pages=156891,
> markPos=FileWALPointer [idx=396, fileOff=3956799, len=79426],
> walSegmentsCleared=13, walSegmentsCovered=[383 - 395], markDuration=1030ms,
> pagesWrite=1275ms, fsync=244176ms, total=246482ms]
> 2019-02-22 12:24:40.217  INFO 5271 --- [oint-thread-#67]
> i.i.p.c.p.GridCacheDatabaseSharedManager : Checkpoint finished
> [cpId=0bc71cc5-97c4-4f7d-95c2-430f379eeeb0, pages=148039,
> markPos=FileWALPointer [idx=407, fileOff=57767331, len=79426],
> walSegmentsCleared=11, walSegmentsCovered=[396 - 406], markDuration=323ms,
> pagesWrite=1620ms, fsync=272460ms, total=274403ms]
>
>
> Thanks in advance.
>
> Antonio
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
