hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2682) regionserver wedged: thread holds lock during blocking read (?)
Date Wed, 23 Jan 2008 03:06:34 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561567#action_12561567
] 

stack commented on HADOOP-2682:
-------------------------------

I see a second instance later in the logs:

{code}
Thread 28 (IPC Server handler 0 on 60020):
  State: RUNNABLE
  Blocked count: 19232
  Waited count: 178149
  Stack:
    java.net.SocketInputStream.socketRead0(Native Method)
    java.net.SocketInputStream.read(Unknown Source)
    java.io.BufferedInputStream.fill(Unknown Source)
    java.io.BufferedInputStream.read(Unknown Source)
    java.io.DataInputStream.readShort(Unknown Source)
    org.apache.hadoop.dfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:773)
    org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1000)
    org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1096)
    java.io.DataInputStream.readFully(Unknown Source)
    java.io.DataInputStream.readFully(Unknown Source)
    org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1383)
    org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1360)
    org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
    org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
    org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:254)
    org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242)
    org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:816)
    org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
    org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
    org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
{code}

getReader is synchronized.  It looks like the lock could be removed (At least some of the
tests done in the method are against finalized data members).  Doesn't address why we're hung
reading from hdfs.  We're trying to read the SequenceFile version block, it looks like.  Your
dfs.socket.timeout is reasonable?  We set this on the socket as read timeout so you'd think
the reads would eventually give up (and then the datanode gets added to list of deadnodes).

> regionserver wedged: thread holds lock during blocking read (?)
> ---------------------------------------------------------------
>
>                 Key: HADOOP-2682
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2682
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>    Affects Versions: 0.15.0
>            Reporter: Michael Bieniosek
>
> Using a patched version of 0.15.0, my regionserver freezes up, saying repeatedly:
> 2008-01-22 23:10:12,617 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60020,
call getProtocolVersion(org.
> apache.hadoop.hbase.HRegionInterface, 1) from xxx.xxx.xxx.178:55116: discarded for being
too old (46464)
> Then I get:
> 2008-01-22 23:11:12,644 WARN org.apache.hadoop.ipc.Server: IPC Server handler 7 on 60020,
call openScanner(-ROOT-,,0, [Lorg.apache.hadoop.io.Text;@19958bf9, .META., 1201043326058,
null) from xxx.xxx.45.192:44993: output error
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(Unknown Source)
>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>         at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
>         at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
>         at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
>         at java.io.BufferedOutputStream.flush(Unknown Source)
>         at java.io.DataOutputStream.flush(Unknown Source)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:615)
> 2008-01-22 23:11:12,660 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding
call getProtocolVersion(org.apache.hadoop.hbase.HRegionInterface, 1) from xxx.xxx.45.224:34506
> 29 active threads
> Thread 4329 (IPC Client connection to xxx.xxx.44.135:10000):
>   State: WAITING
>   Blocked count: 1
>   Waited count: 4
>   Waiting on org.apache.hadoop.ipc.Client$Connection@21e1962d
>   Stack:
>     java.lang.Object.wait(Native Method)
>     java.lang.Object.wait(Unknown Source)
>     org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:216)
>     org.apache.hadoop.ipc.Client$Connection.run(Client.java:255)
> Thread 37 (IPC Server handler 9 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 202
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 36 (IPC Server handler 8 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 201
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 35 (IPC Server handler 7 on 60020):
>   State: RUNNABLE
>   Blocked count: 4
>   Waited count: 240
>   Stack:
>     sun.management.ThreadImpl.getThreadInfo0(Native Method)
>     sun.management.ThreadImpl.getThreadInfo(Unknown Source)
>     sun.management.ThreadImpl.getThreadInfo(Unknown Source)
>     org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:114)
>     org.apache.hadoop.util.ReflectionUtils.logThreadInfo(ReflectionUtils.java:162)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:579)
> Thread 34 (IPC Server handler 6 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 203
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 33 (IPC Server handler 5 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 212
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 32 (IPC Server handler 4 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 202
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 31 (IPC Server handler 3 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 210
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 30 (IPC Server handler 2 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 216
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 29 (IPC Server handler 1 on 60020):
>   State: RUNNABLE
>   Blocked count: 3
>   Waited count: 229
>   Stack:
>     java.net.SocketInputStream.socketRead0(Native Method)
>     java.net.SocketInputStream.read(Unknown Source)
>     java.io.BufferedInputStream.fill(Unknown Source)
>     java.io.BufferedInputStream.read(Unknown Source)
>     java.io.DataInputStream.readShort(Unknown Source)
>     org.apache.hadoop.dfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:773)
>     org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1000)
>     org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1096)
>     java.io.DataInputStream.readFully(Unknown Source)
>     java.io.DataInputStream.readFully(Unknown Source)
>     org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1383)
>     org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1360)
>     org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
>     org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
>     org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:263)
>     org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242)
>     org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:816)
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
> Thread 28 (IPC Server handler 0 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 200
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 14 (IPC Server listener on 60020):
>   State: RUNNABLE
>   Blocked count: 0
>   Waited count: 0
>   Stack:
>     sun.nio.ch.PollArrayWrapper.poll0(Native Method)
>     sun.nio.ch.PollArrayWrapper.poll(Unknown Source)
>     sun.nio.ch.PollSelectorImpl.doSelect(Unknown Source)
>     sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
>     sun.nio.ch.SelectorImpl.select(Unknown Source)
>     sun.nio.ch.SelectorImpl.select(Unknown Source)
>     org.apache.hadoop.ipc.Server$Listener.run(Server.java:285)
> ... etc.
> Note that most of the threads are blocked on thread 29, which is in turn blocked on a
socket read.  This node was hosting the ROOT region, so when it got wedged it made hbase unusable
(since the master won't reassign ROOT until the server actually goes down).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message