hbase-dev mailing list archives

From Andrew Purtell <apurt...@yahoo.com>
Subject Re: Cut a 0.2.1 release candidate?
Date Sun, 31 Aug 2008 19:48:12 GMT
Sorry, the attachment does not seem to have made it through. Here is the full thread dump, just in case:

2008-08-27 16:12:36,822 INFO org.apache.hadoop.mapred.StatusHttpServer: Process Thread Dump:
jsp requested
36 active threads
Thread 247379 (Thread-247366):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 552
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1824)
Thread 51 (IPC Server handler 15 on 60020):
  State: TIMED_WAITING
  Blocked count: 1856
  Waited count: 11461
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 50 (IPC Server handler 14 on 60020):
  State: TIMED_WAITING
  Blocked count: 1772
  Waited count: 11460
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 49 (IPC Server handler 13 on 60020):
  State: TIMED_WAITING
  Blocked count: 1780
  Waited count: 11461
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 48 (IPC Server handler 12 on 60020):
  State: TIMED_WAITING
  Blocked count: 1926
  Waited count: 11461
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 47 (IPC Server handler 11 on 60020):
  State: TIMED_WAITING
  Blocked count: 2153
  Waited count: 11463
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 46 (IPC Server handler 10 on 60020):
  State: TIMED_WAITING
  Blocked count: 1707
  Waited count: 11461
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 45 (IPC Server handler 9 on 60020):
  State: TIMED_WAITING
  Blocked count: 2011
  Waited count: 11461
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 44 (IPC Server handler 8 on 60020):
  State: TIMED_WAITING
  Blocked count: 1952
  Waited count: 11463
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 43 (IPC Server handler 7 on 60020):
  State: TIMED_WAITING
  Blocked count: 1714
  Waited count: 11461
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 42 (IPC Server handler 6 on 60020):
  State: TIMED_WAITING
  Blocked count: 2073
  Waited count: 11462
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 41 (IPC Server handler 5 on 60020):
  State: TIMED_WAITING
  Blocked count: 1802
  Waited count: 11463
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 40 (IPC Server handler 4 on 60020):
  State: TIMED_WAITING
  Blocked count: 1691
  Waited count: 11462
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 39 (IPC Server handler 3 on 60020):
  State: TIMED_WAITING
  Blocked count: 1878
  Waited count: 11464
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 38 (IPC Server handler 2 on 60020):
  State: TIMED_WAITING
  Blocked count: 2057
  Waited count: 11462
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 37 (IPC Server handler 1 on 60020):
  State: TIMED_WAITING
  Blocked count: 1929
  Waited count: 11463
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 36 (IPC Server handler 0 on 60020):
  State: TIMED_WAITING
  Blocked count: 1781
  Waited count: 11462
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 12 (IPC Server listener on 60020):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
    sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
    org.apache.hadoop.ipc.Server$Listener.run(Server.java:299)
Thread 14 (IPC Server Responder):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
    sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    org.apache.hadoop.ipc.Server$Responder.run(Server.java:445)
Thread 35 (SocketListener0-1):
  State: RUNNABLE
  Blocked count: 1
  Waited count: 42174
  Stack:
    sun.management.ThreadImpl.getThreadInfo0(Native Method)
    sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:147)
    sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:123)
    org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:114)
    org.apache.hadoop.util.ReflectionUtils.logThreadInfo(ReflectionUtils.java:168)
    org.apache.hadoop.mapred.StatusHttpServer$StackServlet.doGet(StatusHttpServer.java:259)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
    org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
    org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
    org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
    org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
    org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
    org.mortbay.http.HttpServer.service(HttpServer.java:954)
    org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
    org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
    org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
    org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
    org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
Thread 34 (SocketListener0-0):
  State: RUNNABLE
  Blocked count: 2
  Waited count: 42174
  Stack:
    java.net.SocketInputStream.socketRead0(Native Method)
    java.net.SocketInputStream.read(SocketInputStream.java:129)
    org.mortbay.util.LineInput.fill(LineInput.java:469)
    org.mortbay.util.LineInput.fillLine(LineInput.java:547)
    org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293)
    org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277)
    org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238)
    org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861)
    org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907)
    org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
    org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
    org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
    org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
Thread 33 (Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
    java.net.PlainSocketImpl.socketAccept(Native Method)
    java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
    java.net.ServerSocket.implAccept(ServerSocket.java:453)
    java.net.ServerSocket.accept(ServerSocket.java:421)
    org.mortbay.util.ThreadedServer.acceptSocket(ThreadedServer.java:432)
    org.mortbay.util.ThreadedServer$Acceptor.run(ThreadedServer.java:631)
Thread 32 (SessionScavenger):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 14061
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.mortbay.jetty.servlet.AbstractSessionManager$SessionScavenger.run(AbstractSessionManager.java:587)
Thread 15 (regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 42176
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
    java.util.concurrent.DelayQueue.poll(DelayQueue.java:201)
    org.apache.hadoop.hbase.Leases.run(Leases.java:75)
Thread 11 (regionserver/0:0:0:0:0:0:0:0:60020.worker):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 44769
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
    java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
    org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:805)
    java.lang.Thread.run(Thread.java:619)
Thread 9 (regionserver/0:0:0:0:0:0:0:0:60020.compactor):
  State: TIMED_WAITING
  Blocked count: 217143
  Waited count: 393898
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
    java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
    org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:76)
Thread 8 (regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher):
  State: TIMED_WAITING
  Blocked count: 9627
  Waited count: 228686
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
    java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
    org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:87)
Thread 10 (regionserver/0:0:0:0:0:0:0:0:60020.logRoller):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 43612
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:63)
Thread 30 (org.apache.hadoop.dfs.DFSClient$LeaseChecker@496614e7):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 435045
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:763)
    java.lang.Thread.run(Thread.java:619)
Thread 28 (org.apache.hadoop.io.ObjectWritable Connection Culler):
  State: TIMED_WAITING
  Blocked count: 11
  Waited count: 421031
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:435)
Thread 19 (org.apache.hadoop.io.ObjectWritable Connection Culler):
  State: TIMED_WAITING
  Blocked count: 148
  Waited count: 421026
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:435)
Thread 18 (DestroyJavaVM):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
Thread 17 (regionserver/0:0:0:0:0:0:0:0:60020):
  State: TIMED_WAITING
  Blocked count: 5
  Waited count: 280887
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.hbase.util.Sleeper.sleep(Sleeper.java:72)
    org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:418)
    java.lang.Thread.run(Thread.java:619)
Thread 4 (Signal Dispatcher):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
Thread 3 (Finalizer):
  State: WAITING
  Blocked count: 516
  Waited count: 3543
  Waiting on java.lang.ref.ReferenceQueue$Lock@639abb58
  Stack:
    java.lang.Object.wait(Native Method)
    java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
    java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
    java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
Thread 2 (Reference Handler):
  State: WAITING
  Blocked count: 220
  Waited count: 3530
  Waiting on java.lang.ref.Reference$Lock@7bc659d1
  Stack:
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:485)
    java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
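
The getThreadInfo frames in Thread 35 show that the dump above comes from the JVM's thread management API. For anyone who wants to collect the same counters without going through the UI, here is a minimal sketch using java.lang.management. It is not the Hadoop code; the class name ThreadDumpSketch, the method names, and the stack depth of 20 are made up for illustration. It also asks the JVM whether it can see a lock cycle, which the dump above does not report.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Hadoop or HBase.
final class ThreadDumpSketch {
    // Arbitrary limit on frames printed per thread (an assumption).
    private static final int MAX_STACK_DEPTH = 20;

    // Builds a dump in roughly the same layout as the one above.
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.getAllThreadIds();
        StringBuilder out = new StringBuilder();
        out.append(ids.length).append(" active threads\n");
        for (ThreadInfo info : mx.getThreadInfo(ids, MAX_STACK_DEPTH)) {
            if (info == null) {
                continue; // thread exited between the two calls
            }
            out.append("Thread ").append(info.getThreadId())
               .append(" (").append(info.getThreadName()).append("):\n");
            out.append("  State: ").append(info.getThreadState()).append('\n');
            out.append("  Blocked count: ").append(info.getBlockedCount()).append('\n');
            out.append("  Waited count: ").append(info.getWaitedCount()).append('\n');
            if (info.getLockName() != null) {
                out.append("  Waiting on ").append(info.getLockName()).append('\n');
            }
            out.append("  Stack:\n");
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    // Returns the ids of threads the JVM considers deadlocked, or an
    // empty array if it finds no cycle.
    static long[] deadlockedIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()         // monitors and java.util.concurrent locks
                : mx.findMonitorDeadlockedThreads(); // object monitors only
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        System.out.print(dump());
        System.out.println("Deadlocked threads: " + deadlockedIds().length);
    }
}
```

Note that the deadlock check only reports threads caught in a cycle of lock ownership. A server whose threads are all parked in timed waits would come back with an empty result, so an empty result does not rule out a hang of the kind described in this thread.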



--- On Sun, 8/31/08, Andrew Purtell <apurtell@yahoo.com> wrote:

From: Andrew Purtell <apurtell@yahoo.com>
Subject: Re: Cut a 0.2.1 release candidate?
To: hbase-dev@hadoop.apache.org
Date: Sunday, August 31, 2008, 12:43 PM

The log...


--- On Sun, 8/31/08, Andrew Purtell <apurtell@yahoo.com> wrote:
 
From: Andrew Purtell <apurtell@yahoo.com>
Subject: Re: Cut a 0.2.1 release candidate?
To: hbase-dev@hadoop.apache.org
Date: Sunday, August 31, 2008, 12:41 PM

+1, especially the locking evaluation and de-entanglement
done by Jim and J-D as part of 810.
 
We might be seeing regionserver deadlocks with 0.2.0. See
attached partial log, including stack trace requested from
the UI, from a regionserver that reports to the master but
does not handle requests from clients, hanging them. There
is no hint that anything is amiss in the log. All of the
IPC handlers are blocked. Also, I'm not sure what to make of
the high counts on CompactSplitThread.compactionQueue:
 
  Thread 9 (regionserver/0:0:0:0:0:0:0:0:60020.compactor):
    State: TIMED_WAITING
    Blocked count: 217143
    Waited count: 393898
    Stack:
      sun.misc.Unsafe.park(Native Method)
      java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
      java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
      java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
      org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:76)

but I do not think I have enough solid information to file a
JIRA and investigate/fix this yet.

I'm hoping we just won't see this with 0.2.1. :-)

  -  Andy



      
