hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-2682) regionserver wedged: thread holds lock during blocking read (?)
Date Wed, 23 Jan 2008 23:24:34 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

stack resolved HADOOP-2682.
---------------------------

       Resolution: Invalid
    Fix Version/s: 0.16.0

Chatting w/ Michael, turns out a bunch of datanodes were stuck in swap-hell.  Speculation
therefore is that the reads were actually doing something, just super slowly.  Meantime threads
were blocked because of the synchronized getReader (fix is in patch for HADOOP-2692).   We
agreed to resolve this issue Invalid.

> regionserver wedged: thread holds lock during blocking read (?)
> ---------------------------------------------------------------
>
>                 Key: HADOOP-2682
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2682
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hbase
>    Affects Versions: 0.15.0
>            Reporter: Michael Bieniosek
>             Fix For: 0.16.0
>
>
> Using a patched version of 0.15.0, my regionserver freezes up, saying repeatedly:
> 2008-01-22 23:10:12,617 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60020,
call getProtocolVersion(org.
> apache.hadoop.hbase.HRegionInterface, 1) from xxx.xxx.xxx.178:55116: discarded for being
too old (46464)
> Then I get:
> 2008-01-22 23:11:12,644 WARN org.apache.hadoop.ipc.Server: IPC Server handler 7 on 60020,
call openScanner(-ROOT-,,0, [Lorg.apache.hadoop.io.Text;@19958bf9, .META., 1201043326058,
null) from xxx.xxx.45.192:44993: output error
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(Unknown Source)
>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>         at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
>         at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
>         at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
>         at java.io.BufferedOutputStream.flush(Unknown Source)
>         at java.io.DataOutputStream.flush(Unknown Source)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:615)
> 2008-01-22 23:11:12,660 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding
call getProtocolVersion(org.apache.hadoop.hbase.HRegionInterface, 1) from xxx.xxx.45.224:34506
> 29 active threads
> Thread 4329 (IPC Client connection to xxx.xxx.44.135:10000):
>   State: WAITING
>   Blocked count: 1
>   Waited count: 4
>   Waiting on org.apache.hadoop.ipc.Client$Connection@21e1962d
>   Stack:
>     java.lang.Object.wait(Native Method)
>     java.lang.Object.wait(Unknown Source)
>     org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:216)
>     org.apache.hadoop.ipc.Client$Connection.run(Client.java:255)
> Thread 37 (IPC Server handler 9 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 202
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 36 (IPC Server handler 8 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 201
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 35 (IPC Server handler 7 on 60020):
>   State: RUNNABLE
>   Blocked count: 4
>   Waited count: 240
>   Stack:
>     sun.management.ThreadImpl.getThreadInfo0(Native Method)
>     sun.management.ThreadImpl.getThreadInfo(Unknown Source)
>     sun.management.ThreadImpl.getThreadInfo(Unknown Source)
>     org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:114)
>     org.apache.hadoop.util.ReflectionUtils.logThreadInfo(ReflectionUtils.java:162)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:579)
> Thread 34 (IPC Server handler 6 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 203
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 33 (IPC Server handler 5 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 212
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 32 (IPC Server handler 4 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 202
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 31 (IPC Server handler 3 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 210
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 30 (IPC Server handler 2 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 216
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 29 (IPC Server handler 1 on 60020):
>   State: RUNNABLE
>   Blocked count: 3
>   Waited count: 229
>   Stack:
>     java.net.SocketInputStream.socketRead0(Native Method)
>     java.net.SocketInputStream.read(Unknown Source)
>     java.io.BufferedInputStream.fill(Unknown Source)
>     java.io.BufferedInputStream.read(Unknown Source)
>     java.io.DataInputStream.readShort(Unknown Source)
>     org.apache.hadoop.dfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:773)
>     org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1000)
>     org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1096)
>     java.io.DataInputStream.readFully(Unknown Source)
>     java.io.DataInputStream.readFully(Unknown Source)
>     org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1383)
>     org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1360)
>     org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
>     org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
>     org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:263)
>     org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242)
>     org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:816)
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
> Thread 28 (IPC Server handler 0 on 60020):
>   State: BLOCKED
>   Blocked count: 1
>   Waited count: 200
>   Blocked on org.apache.hadoop.hbase.HStoreFile@72e8a021
>   Blocked by 29 (IPC Server handler 1 on 60020)
>   Stack:
>     org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936)
>     org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057)
>     org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203)
>     org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025)
>     org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699)
>     org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271)
>     org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499)
>     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     java.lang.reflect.Method.invoke(Unknown Source)
>     org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>     org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> Thread 14 (IPC Server listener on 60020):
>   State: RUNNABLE
>   Blocked count: 0
>   Waited count: 0
>   Stack:
>     sun.nio.ch.PollArrayWrapper.poll0(Native Method)
>     sun.nio.ch.PollArrayWrapper.poll(Unknown Source)
>     sun.nio.ch.PollSelectorImpl.doSelect(Unknown Source)
>     sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
>     sun.nio.ch.SelectorImpl.select(Unknown Source)
>     sun.nio.ch.SelectorImpl.select(Unknown Source)
>     org.apache.hadoop.ipc.Server$Listener.run(Server.java:285)
> ... etc.
> Note that most of the threads are blocked on thread 29, which is in turn blocked on a
socket read.  This node was hosting the ROOT region, so when it got wedged it made hbase unusable
(since the master won't reassign ROOT until the server actually goes down).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message