ignite-issues mailing list archives

From "Denis Magda (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (IGNITE-1294) Assertion in TCP communication SPI: client already created
Date Mon, 31 Aug 2015 14:59:45 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723513#comment-14723513 ]

Denis Magda edited comment on IGNITE-1294 at 8/31/15 2:59 PM:
--------------------------------------------------------------

I don't see an easy way to fully fix the race condition between the creation of shmem and TCP clients.
A complete fix seems to require more effort from us.

Thus, for now only the bug fixes on the TCP side have been merged. All the changes made to fix the race
have been reverted.

Let's return to this task when it becomes an issue of a higher priority.

The race is caused by this code snippet, which is part of the {{onFirstMessage}} method of {{TcpCommunicationSpi}}.
{noformat}
                    else {
                        boolean reserved = recoveryDesc.tryReserve(msg0.connectCount(),
                                new ConnectClosure(ses, recoveryDesc, rmtNode, msg0, !hasShmemClient, fut));

                        if (reserved)
                            connected(recoveryDesc, ses, rmtNode, msg0.received(), true, !hasShmemClient);
                    }
{noformat} 
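
To make the failure mode concrete, here is a minimal sketch of the general check-then-act pattern that can produce a "client already created" condition. It is not Ignite code and not a proposed fix: the {{Registry}} class, its methods, and the string-valued clients are hypothetical stand-ins, and the interleaving is replayed sequentially so the outcome is deterministic. The order (TCP registered first, shmem second) is chosen to mirror the {{client}} and {{oldClient}} values in the reported assertion.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Illustrative sketch only; none of these types are Ignite classes. */
public class ClientRaceSketch {
    /** Hypothetical per-node client registry. */
    static final class Registry {
        private final ConcurrentMap<UUID, String> clients = new ConcurrentHashMap<>();

        /** Step 1 of check-then-act: is a client already registered for the node? */
        boolean hasClient(UUID nodeId) {
            return clients.containsKey(nodeId);
        }

        /** Step 2: register a client; returns the previously registered one, or null. */
        String register(UUID nodeId, String client) {
            return clients.putIfAbsent(nodeId, client);
        }
    }

    /**
     * Replays one unlucky interleaving of the two creation paths and returns
     * the pre-existing client that the late registrant finds.
     */
    static String replayInterleaving() {
        Registry reg = new Registry();
        UUID nodeId = UUID.randomUUID();

        // Both paths check before either registers, so both see "no client".
        boolean tcpSawNone = !reg.hasClient(nodeId);
        boolean shmemSawNone = !reg.hasClient(nodeId);

        String oldForShmem = null;

        // The TCP path registers first and finds the slot empty.
        if (tcpSawNone)
            reg.register(nodeId, "tcp");

        // The shmem path acts on its stale check and finds the TCP client in place.
        if (shmemSawNone)
            oldForShmem = reg.register(nodeId, "shmem");

        return oldForShmem;
    }

    public static void main(String[] args) {
        // A non-null old client is the state an "already created" assertion would reject.
        System.out.println("client=shmem, oldClient=" + replayInterleaving());
    }
}
```

In this replay the late registrant observes {{oldClient=tcp}}. Each path's check was valid when it ran, but stale by the time the path acted on it.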


was (Author: dmagda):
I don't see an easy way to fully fix the race condition between the creation of shmem and TCP clients.
A complete fix seems to require more effort from us.

Thus, for now only the bug fixes on the TCP side have been merged. All the changes made to fix the race
have been reverted.

Let's return to this task when it becomes an issue with a higher priority.

The race is caused by this code snippet, which is part of the {{onFirstMessage}} method of {{TcpCommunicationSpi}}.
{noformat}
                    else {
                        boolean reserved = recoveryDesc.tryReserve(msg0.connectCount(),
                                new ConnectClosure(ses, recoveryDesc, rmtNode, msg0, !hasShmemClient, fut));

                        if (reserved)
                            connected(recoveryDesc, ses, rmtNode, msg0.received(), true, !hasShmemClient);
                    }
{noformat} 

> Assertion in TCP communication SPI: client already created
> ----------------------------------------------------------
>
>                 Key: IGNITE-1294
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1294
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.1.4
>            Reporter: Alexey Goncharuk
>            Assignee: Denis Magda
>         Attachments: ignite-1294.patch
>
>
> Observed this failure on TC in master branch:
> {code}
> [19:39:53]W:		 [org.apache.ignite:ignite-core] java.lang.AssertionError: Client already
created [
> 	node=TcpDiscoveryNode [id=00db22a2-37de-4d41-9a81-1b3ccb7a3000, addrs=[127.0.0.1], sockAddrs=[/127.0.0.1:47500],
discPort=47500, order=1, intOrder=1, lastExchangeTime=1440434393018, loc=false, ver=1.4.1#19700101-sha1:00000000,
isClient=false],
> 	client=GridShmemCommunicationClient [shmem=IpcSharedMemoryClientEndpoint [inSpace=IpcSharedMemorySpace
[opSize=262144, shmemPtr=139828001624128, shmemId=815824901, semId=696811527, closed=false,
isReader=true, writerPid=23710, readerPid=23710, tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1087-23710-262144,
closed=false], outSpace=IpcSharedMemorySpace [opSize=262144, shmemPtr=139828001357888, shmemId=815792132,
semId=696778758, closed=false, isReader=false, writerPid=23710, readerPid=23710, tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1086-23710-262144,
closed=false], checkIn=true, checkOut=true], writeBuf=java.nio.HeapByteBuffer[pos=0 lim=8192
cap=8192], formatter=org.apache.ignite.internal.managers.communication.GridIoManager$2@489a1849,
super=GridAbstractCommunicationClient [lastUsed=1440434393133, reserves=0]], 
> 	oldClient=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [selectorIdx=0,
queueSize=0, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0
lim=32768 cap=32768], recovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=2,
reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=00db22a2-37de-4d41-9a81-1b3ccb7a3000,
addrs=[127.0.0.1], sockAddrs=[/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1440434393018,
loc=false, ver=1.4.1#19700101-sha1:00000000, isClient=false], connected=true, connectCnt=0,
queueLimit=5120], super=GridNioSessionImpl [locAddr=/127.0.0.1:45254, rmtAddr=/127.0.0.1:53055,
createTime=1440434393174, closeTime=0, bytesSent=26, bytesRcvd=345, sndSchedTime=1440434393174,
lastSndTime=1440434393174, lastRcvTime=1440434393184, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@1cc9616c, directMode=true], GridConnectionBytesVerifyFilter],
accepted=true]], super=GridAbstractCommunicationClient [lastUsed=1440434393174, reserves=0]]]
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:1909)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1840)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1806)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1020)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1168)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:598)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendLocalPartitions(GridDhtPartitionsExchangeFuture.java:932)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendPartitions(GridDhtPartitionsExchangeFuture.java:973)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:839)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1122)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:108)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] 	at java.lang.Thread.run(Thread.java:745)
> [19:39:53]W:		 [org.apache.ignite:ignite-core] Exception in thread "exchange-worker-#15005%replicated.GridCacheSyncReplicatedPreloadSelfTest45%"
java.lang.AssertionError: Client already created [node=TcpDiscoveryNode [id=00db22a2-37de-4d41-9a81-1b3ccb7a3000,
addrs=[127.0.0.1], sockAddrs=[/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1440434393018,
loc=false, ver=1.4.1#19700101-sha1:00000000, isClient=false], client=GridShmemCommunicationClient
[shmem=IpcSharedMemoryClientEndpoint [inSpace=IpcSharedMemorySpace [opSize=262144, shmemPtr=139828001624128,
shmemId=815824901, semId=696811527, closed=false, isReader=true, writerPid=23710, readerPid=23710,
tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1087-23710-262144,
closed=false], outSpace=IpcSharedMemorySpace [opSize=262144, shmemPtr=139828001357888, shmemId=815792132,
semId=696778758, closed=false, isReader=false, writerPid=23710, readerPid=23710, tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1086-23710-262144,
closed=false], checkIn=true, checkOut=true], writeBuf=java.nio.HeapByteBuffer[pos=0 lim=8192
cap=8192], formatter=org.apache.ignite.internal.managers.communication.GridIoManager$2@489a1849,
super=GridAbstractCommunicationClient [lastUsed=1440434393133, reserves=0]], oldClient=GridTcpNioCommunicationClient
[ses=GridSelectorNioSessionImpl [selectorIdx=0, queueSize=0, writeBuf=java.nio.DirectByteBuffer[pos=0
lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], recovery=GridNioRecoveryDescriptor
[acked=0, resendCnt=0, rcvCnt=2, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode
[id=00db22a2-37de-4d41-9a81-1b3ccb7a3000, addrs=[127.0.0.1], sockAddrs=[/127.0.0.1:47500],
discPort=47500, order=1, intOrder=1, lastExchangeTime=1440434393018, loc=false, ver=1.4.1#19700101-sha1:00000000,
isClient=false], connected=true, connectCnt=0, queueLimit=5120], super=GridNioSessionImpl
[locAddr=/127.0.0.1:45254, rmtAddr=/127.0.0.1:53055, createTime=1440434393174, closeTime=0,
bytesSent=26, bytesRcvd=345, sndSchedTime=1440434393174, lastSndTime=1440434393174, lastRcvTime=1440434393184,
readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@1cc9616c,
directMode=true], GridConnectionBytesVerifyFilter], accepted=true]], super=GridAbstractCommunicationClient
[lastUsed=1440434393174, reserves=0]]]
> {code}
> Because of this, the exchange hung. It looks like the Shmem and TCP clients were created concurrently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
