ignite-user mailing list archives

From Ivan Rakov <ivan.glu...@gmail.com>
Subject Re: Large durable caches
Date Mon, 19 Feb 2018 22:56:32 GMT
It's hard to determine the problem from these messages. I don't see 
anything unhealthy regarding persistence: a checkpoint start is a regular 
event.

There have been cases where excessive GC load on the client side seriously 
affected the throughput and latency of data streaming. You may consider 
tuning the following data streamer parameters:

> public void perNodeBufferSize(int bufSize) - defines how many items 
> should be accumulated in the buffer before it is sent to the server node
> public void perNodeParallelOperations(int parallelOps) - defines how 
> many buffers can be sent to the node without an acknowledgement that 
> they were processed
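
To make the interplay of these two parameters more concrete, here is a toy 
model in plain Java. It is only a sketch under my own assumptions and is not 
how Ignite implements the streamer: the class StreamerBufferSketch and its 
methods (add, flush, ack, canSend) are made-up names, and the "sender" is a 
hypothetical stand-in for the network. bufSize plays the role of 
perNodeBufferSize and parallelOps the role of perNodeParallelOperations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

/** Toy model of a per-node streamer buffer. NOT Ignite's implementation. */
final class StreamerBufferSketch {
    private final int bufSize;                    // role of perNodeBufferSize
    private final int parallelOps;                // role of perNodeParallelOperations
    private final Semaphore inFlight;             // one permit per unacknowledged batch allowed
    private final Consumer<List<Integer>> sender; // hypothetical transport to the server node
    private List<Integer> buf = new ArrayList<>();
    private int batchesSent;

    public StreamerBufferSketch(int bufSize, int parallelOps, Consumer<List<Integer>> sender) {
        if (bufSize <= 0 || parallelOps <= 0)
            throw new IllegalArgumentException("bufSize and parallelOps must be positive");
        this.bufSize = bufSize;
        this.parallelOps = parallelOps;
        this.inFlight = new Semaphore(parallelOps);
        this.sender = sender;
    }

    /** Buffers one item and sends a batch once bufSize items have accumulated. */
    public void add(int item) {
        buf.add(item);
        if (buf.size() >= bufSize)
            flush();
    }

    /** Sends whatever is buffered; blocks while parallelOps batches are unacknowledged. */
    public void flush() {
        if (buf.isEmpty())
            return;
        inFlight.acquireUninterruptibly();
        List<Integer> batch = buf;
        buf = new ArrayList<>();
        batchesSent++;
        sender.accept(batch);
    }

    /** Models the server acknowledging one processed batch. */
    public void ack() {
        if (unacknowledged() > 0)
            inFlight.release();
    }

    public boolean canSend() { return inFlight.availablePermits() > 0; }

    public int unacknowledged() { return parallelOps - inFlight.availablePermits(); }

    public int batchesSent() { return batchesSent; }

    public static void main(String[] args) {
        List<List<Integer>> wire = new ArrayList<>();
        StreamerBufferSketch s = new StreamerBufferSketch(3, 2, wire::add);
        for (int i = 0; i < 6; i++)
            s.add(i);
        System.out.println("batches sent: " + s.batchesSent()
            + ", unacknowledged: " + s.unacknowledged());
        s.ack();
        System.out.println("can send after ack: " + s.canSend());
    }
}
```

In this model a larger bufSize means fewer but bigger batches, and a larger 
parallelOps means more data can be in flight before the producer has to wait 
for the server. On a real streamer you would set the two values through the 
setters quoted above and measure the effect on your own load.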

Best Regards,
Ivan Rakov

On 19.02.2018 22:24, lawrencefinn wrote:
> Okay I am trying to reproduce.  It hasn't got stuck yet, but the client got
> disconnected and reconnected recently.  I don't think it is related to GC
> because I am recording GC times and they do not jump up that much.  Could
> the system get slow under heavy I/O?  I see this in the Ignite log:
>
> [19:13:01,988][WARNING][grid-timeout-worker-#71][diagnostic] Found long
> running cache future [startTime=19:11:56.656, curTime=19:13:01.911,
>   fut=GridNearAtomicSingleUpdateFuture [reqState=Primary
> [id=62a2a255-3320-4040-aa23-ffb86dec7586, opRes=false, expCnt=-1, rcvdCnt=0,
> primaryRes=false, done=false, waitFor=null, rcvd=null],
> super=GridNearAtomicAbstractUpdateFuture [remapCnt=100,
> topVer=AffinityTopologyVersion [topVer=3, minorTopVer=14], remapTopVer=null,
> err=null, futId=313296239, super=GridFutureAdapter [ignoreInterrupts=false,
> state=INIT, res=null, hash=1229092316]]]]
> [19:13:01,988][WARNING][grid-timeout-worker-#71][diagnostic] Found long
> running cache future [startTime=19:11:39.917, curTime=19:13:01.911,
> fut=GridNearAtomicSingleUpdateFuture [reqState=Primary
> [id=62a2a255-3320-4040-aa23-ffb86dec7586, opRes=false, expCnt=-1, rcvdCnt=0,
> primaryRes=false, done=false, waitFor=null, rcvd=null],
> super=GridNearAtomicAbstractUpdateFuture [remapCnt=100,
> topVer=AffinityTopologyVersion [topVer=3, minorTopVer=14], remapTopVer=null,
> err=null, futId=312914655, super=GridFutureAdapter [ignoreInterrupts=false,
> state=INIT, res=null, hash=15435296]]]]
> [19:13:51,057][INFO][db-checkpoint-thread-#110][GridCacheDatabaseSharedManager]
> Checkpoint started [checkpointId=77744626-04e6-4e17-bda7-23ecb50bbe19,
> startPtr=FileWALPointer [idx=9600, fileOffset=35172819, len=124303,
> forceFlush=true], checkpointLockWait=57708ms, checkpointLockHoldTime=64ms,
> pages=3755135, reason='too many dirty pages']
> [19:14:01,919][INFO][grid-timeout-worker-#71][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>      ^-- Node [id=62a2a255, uptime=01:42:41.752]
>      ^-- H/N/C [hosts=2, nodes=3, CPUs=64]
>      ^-- CPU [cur=77.83%, avg=39.11%, GC=0.13%]
>      ^-- PageMemory [pages=5111642]
>      ^-- Heap [used=11669MB, free=43.02%, comm=20480MB]
>      ^-- Non heap [used=67MB, free=95.56%, comm=69MB]
>      ^-- Public thread pool [active=0, idle=0, qSize=0]
>      ^-- System thread pool [active=0, idle=6, qSize=0]
>      ^-- Outbound messages queue [size=0]
> [19:15:03,470][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery
> accepted incoming connection [rmtAddr=/127.0.0.1, rmtPort=33542]
> [19:15:03,470][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery
> spawning a new thread for connection [rmtAddr=/127.0.0.1, rmtPort=33542]
>
>
> My app log has:
> 2018-02-19 19:15:02,176 [WARN] from
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi in
> tcp-client-disco-reconnector-#5%cabooseGrid% - Timed out waiting for message
> to be read (most probably, the reason is long GC pauses on remote node)
> [curTimeout=5000, rmtAddr=/127.0.0.1:47500, rmtPort=47500]
> 2018-02-19 19:15:02,176 [ERROR] from
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi in
> tcp-client-disco-reconnector-#5%cabooseGrid% - Exception on joining: Failed
> to deserialize object with given class loader:
> sun.misc.Launcher$AppClassLoader@28d93b30
> org.apache.ignite.IgniteCheckedException: Failed to deserialize object with
> given class loader: sun.misc.Launcher$AppClassLoader@28d93b30
>          at
> org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:129)
>          at
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94)
>          at
> org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:9740)
>          at
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.readMessage(TcpDiscoverySpi.java:1590)
>          at
> org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:627)
>          at
> org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:524)
>          at
> org.apache.ignite.spi.discovery.tcp.ClientImpl.access$900(ClientImpl.java:124)
>          at
> org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1377)
>          at
> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> Caused by: java.net.SocketTimeoutException: Read timed out
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
