hadoop-common-issues mailing list archives

From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-14198) Should have a way to let PingInputStream to abort
Date Sun, 19 Mar 2017 06:53:41 GMT
Yongjun Zhang created HADOOP-14198:
--------------------------------------

             Summary: Should have a way to let PingInputStream to abort
                 Key: HADOOP-14198
                 URL: https://issues.apache.org/jira/browse/HADOOP-14198
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Yongjun Zhang


We observed a case where an RPC call got stuck, because PingInputStream does the following:

{code}
    /** This class sends a ping to the remote side when timeout on
     * reading. If no failure is detected, it retries until at least
     * a byte is read.
     */
    private class PingInputStream extends FilterInputStream {
{code}

It seems that in this case no data is ever received, so it keeps pinging.

Should we ping forever here? Maybe we should introduce a config that stops the pinging after a
certain number of attempts and reports a timeout, so that the caller can retry the RPC.
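
The bounded-ping idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Hadoop code or a proposed patch: the class name `BoundedPingInputStream`, the `maxPings` limit (which would presumably come from a new, not yet existing config key), and the `sendPing` callback are all hypothetical stand-ins.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;

/**
 * Hypothetical sketch: like PingInputStream, it pings the remote side when a
 * read times out, but gives up after a bounded number of pings.
 */
class BoundedPingInputStream extends FilterInputStream {
    private final int maxPings;      // assumed limit; a real patch would read it from config
    private final Runnable sendPing; // stand-in for the code that sends the ping

    BoundedPingInputStream(InputStream in, int maxPings, Runnable sendPing) {
        super(in);
        this.maxPings = maxPings;
        this.sendPing = sendPing;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int pings = 0;
        while (true) {
            try {
                return super.read(buf, off, len);
            } catch (SocketTimeoutException e) {
                if (pings >= maxPings) {
                    // Ping budget exhausted: report the timeout to the caller
                    // instead of retrying forever.
                    throw e;
                }
                pings++;
                sendPing.run();
            }
        }
    }

    @Override
    public int read() throws IOException {
        // Route single-byte reads through the bounded loop above.
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n == -1 ? -1 : (one[0] & 0xff);
    }
}
```

With a limit of 3, a read that never receives data would send three pings and then surface the `SocketTimeoutException`, at which point the caller could decide whether to retry the RPC.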

I wonder whether the RPC could somehow have been dropped by the server, so that no response is
ever received.

See the stack trace of the stuck thread:
{code}
Thread 16127: (state = BLOCKED)
 - sun.nio.ch.SocketChannelImpl.readerCleanup() @bci=6, line=279 (Compiled frame)
 - sun.nio.ch.SocketChannelImpl.read(java.nio.ByteBuffer) @bci=205, line=390 (Compiled frame)
 - org.apache.hadoop.net.SocketInputStream$Reader.performIO(java.nio.ByteBuffer) @bci=5, line=57 (Compiled frame)
 - org.apache.hadoop.net.SocketIOWithTimeout.doIO(java.nio.ByteBuffer, int) @bci=35, line=142 (Compiled frame)
 - org.apache.hadoop.net.SocketInputStream.read(java.nio.ByteBuffer) @bci=6, line=161 (Compiled frame)
 - org.apache.hadoop.net.SocketInputStream.read(byte[], int, int) @bci=7, line=131 (Compiled frame)
 - java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
 - java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
 - org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(byte[], int, int) @bci=4, line=521 (Compiled frame)
 - java.io.BufferedInputStream.fill() @bci=214, line=246 (Compiled frame)
 - java.io.BufferedInputStream.read() @bci=12, line=265 (Compiled frame)
 - java.io.DataInputStream.readInt() @bci=4, line=387 (Compiled frame)
 - org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse() @bci=19, line=1081 (Compiled frame)
 - org.apache.hadoop.ipc.Client$Connection.run() @bci=62, line=976 (Compiled frame)
{code}


 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

