hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-14926) Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets stuck reading
Date Fri, 04 Dec 2015 21:12:10 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

stack updated HBASE-14926:
    Attachment: 14926v2.txt

Ok. Played more trying to capture thriftserver in described state. I saw the issue in thread
dump but proved elusive (wasn't sure if it my timeout that fixed it or not). The change is
pretty basic so should be safe enough... setting a timeout on server socket... which we want
anyways.  Going to commit on back of @apurtell +1. v2 adds logging of the timeout to thrift2
version too and adds a sentence or two more to examples on how to get going with thrift server.

> Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets
stuck reading
> ------------------------------------------------------------------------------------------------------
>                 Key: HBASE-14926
>                 URL: https://issues.apache.org/jira/browse/HBASE-14926
>             Project: HBase
>          Issue Type: Bug
>          Components: Thrift
>    Affects Versions: 2.0.0, 1.2.0, 1.1.2, 1.3.0, 1.0.3, 0.98.16
>            Reporter: stack
>            Assignee: stack
>         Attachments: 14926.patch, 14926v2.txt
> Thrift server is hung. All worker threads are doing this:
> {code}
> "thrift-worker-0" daemon prio=10 tid=0x00007f0bb95c2800 nid=0xf6a7 runnable [0x00007f0b956e0000]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:152)
>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>         - locked <0x000000066d859490> (a java.io.BufferedInputStream)
>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
>         at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:601)
>         at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:470)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
>         at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
>         at org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> They never recover.
> I don't have client side logs.
> We've been here before: HBASE-4967 "connected client thrift sockets should have a server
side read timeout" but this patch only got applied to fb branch (and thrift has changed since

This message was sent by Atlassian JIRA

View raw message