ignite-issues mailing list archives

From "Pavel Voronkin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-11288) TcpDiscovery deadlock on SSLSocket.close().
Date Tue, 12 Feb 2019 19:29:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Voronkin updated IGNITE-11288:
------------------------------------
    Description: 
The root cause is a Java bug: SSLSocketImpl.close() blocks on the write lock:

// We create the socket with soTimeout(0) here, but setting a timeout here would not help anyway.
 RingMessageWorker: 3152 sock = spi.openSocket(addr, timeoutHelper);

// After the timeout, grid-timeout-worker blocks forever because SSLSocketImpl.close(), called from onTimeout, hangs on writeLock.

According to the Java 8 SSLSocketImpl:
{code:java}
if (var1.isAlert((byte)0) && this.getSoLinger() >= 0) {
    boolean var3 = Thread.interrupted();

    try {
        if (this.writeLock.tryLock((long)this.getSoLinger(), TimeUnit.SECONDS)) {
            try {
                this.writeRecordInternal(var1, var2);
            } finally {
                this.writeLock.unlock();
            }
        } else {
            SSLException var4 = new SSLException("SO_LINGER timeout, close_notify message cannot be sent.");
            if (this.isLayered() && !this.autoClose) {
                this.fatal((byte)-1, (Throwable)var4);
            } else if (debug != null && Debug.isOn("ssl")) {
                System.out.println(Thread.currentThread().getName() + ", received Exception: " + var4);
            }

            this.sess.invalidate();
        }
    } catch (InterruptedException var14) {
        var3 = true;
    }

    if (var3) {
        Thread.currentThread().interrupt();
    }
} else {
    this.writeLock.lock();
    try {
        this.writeRecordInternal(var1, var2);
    } finally {
        this.writeLock.unlock();
    }
}
{code}
If SO_LINGER is not set, we fall back to this.writeLock.lock(), which waits forever
because RingMessageWorker is writing a message with SO_TIMEOUT set to zero.
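
The difference between the two branches can be reproduced outside the JDK. The following is a minimal sketch, not the SSLSocketImpl code: a plain ReentrantLock stands in for writeLock, and the names WriteLockDemo, closeWithLinger and "stuck-writer" are hypothetical. It only illustrates that a timed tryLock() returns once the linger interval expires while another thread still holds the lock, whereas an untimed lock() in the same position would never return.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class WriteLockDemo {
    /**
     * Mimics the SO_LINGER branch: wait at most lingerSeconds for the lock.
     * Returns true if the lock was acquired, false if the wait timed out.
     */
    static boolean closeWithLinger(ReentrantLock writeLock, int lingerSeconds) {
        try {
            if (writeLock.tryLock(lingerSeconds, TimeUnit.SECONDS)) {
                try {
                    return true; // a close_notify would be written here
                } finally {
                    writeLock.unlock();
                }
            }
            return false; // timed out: the caller can give up and close the socket
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Holds the lock in a "stuck" writer thread and tries a timed close. */
    static boolean demo() {
        ReentrantLock writeLock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Stand-in for a writer blocked in a socket write while holding the lock.
        Thread writer = new Thread(() -> {
            writeLock.lock();
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                writeLock.unlock();
            }
        }, "stuck-writer");
        writer.setDaemon(true);
        writer.start();

        try {
            held.await();
            // With writeLock.lock() here instead, this thread would block
            // until the writer released the lock, possibly forever.
            boolean acquired = closeWithLinger(writeLock, 1);
            release.countDown();
            writer.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("acquired = " + demo());
    }
}
```

In this sketch the closing thread gives up after one second and reports that the lock was not acquired, which corresponds to the "SO_LINGER timeout" path in the decompiled code.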

Solution:

1) Set a proper SO_TIMEOUT. // That did not help on Linux when we dropped packets using iptables.

2) Set SO_LINGER to some reasonable positive value.
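
Both options are set through the standard java.net.Socket API, which SSLSocket inherits. The snippet below is only a sketch of what applying them could look like, not the actual Ignite change: the class and method names are hypothetical, and the values (5000 ms, 5 s) are placeholders rather than recommended settings. An unconnected plain Socket is used so the example runs without a network peer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

public class SocketOptionsSketch {
    /**
     * Applies a read timeout and a positive SO_LINGER to the socket.
     * With SO_LINGER >= 0, the Java 8 SSLSocketImpl code quoted above
     * takes the timed tryLock() branch on close.
     */
    static void applyCloseSafeOptions(Socket sock, int soTimeoutMs, int lingerSeconds)
            throws SocketException {
        sock.setSoTimeout(soTimeoutMs);        // SO_TIMEOUT, in milliseconds
        sock.setSoLinger(true, lingerSeconds); // SO_LINGER, in seconds
    }

    /** Returns {SO_TIMEOUT, SO_LINGER} as read back from the socket. */
    static int[] demo() {
        try (Socket sock = new Socket()) {
            applyCloseSafeOptions(sock, 5000, 5); // placeholder values
            return new int[] {sock.getSoTimeout(), sock.getSoLinger()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] opts = demo();
        System.out.println("SO_TIMEOUT=" + opts[0] + " ms, SO_LINGER=" + opts[1] + " s");
    }
}
```

Note that SO_LINGER is specified in seconds while SO_TIMEOUT is in milliseconds, so the two values are not directly comparable.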

Similar JDK bug [https://bugs.openjdk.java.net/browse/JDK-6668261].

The people there ended up setting SO_LINGER.

 



> TcpDiscovery deadlock on SSLSocket.close().
> -------------------------------------------
>
>                 Key: IGNITE-11288
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11288
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Pavel Voronkin
>            Assignee: Pavel Voronkin
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
