hbase-issues mailing list archives

From "Prakash Khemani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3814) force regionserver to halt
Date Fri, 22 Apr 2011 21:06:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023408#comment-13023408 ]

Prakash Khemani commented on HBASE-3814:
----------------------------------------

I don't have access to the logs right now. The server is powered down and I don't want to
bring it up.

In all likelihood the server that got stuck had a DFS version-mismatch problem. It got stuck
in a portion of the code that Dhruba recently introduced and that is only present in the
internal branch.



> force regionserver to halt
> --------------------------
>
>                 Key: HBASE-3814
>                 URL: https://issues.apache.org/jira/browse/HBASE-3814
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Prakash Khemani
>
> Once abort() on a regionserver is called, we should have a timeout thread that calls Runtime.halt() if the RS gets stuck somewhere during abort processing.
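> A minimal sketch of such a watchdog (the class name, thread name, and timeout are assumptions; this illustrates the idea, it is not actual HBase code):
>
>     // Hypothetical sketch; class name, thread name, and timeout are assumptions.
>     class AbortWatchdog extends Thread {
>       private final long timeoutMs;
>
>       AbortWatchdog(long timeoutMs) {
>         this.timeoutMs = timeoutMs;
>         setDaemon(true);           // must not keep a dying JVM alive by itself
>         setName("abort-watchdog");
>       }
>
>       @Override
>       public void run() {
>         try {
>           Thread.sleep(timeoutMs); // bounded grace period for abort processing
>         } catch (InterruptedException e) {
>           return;                  // abort completed in time; watchdog cancelled
>         }
>         // Abort is stuck (e.g. blocked rolling a hung HLog). halt() kills the
>         // process without running shutdown hooks, which could themselves hang.
>         Runtime.getRuntime().halt(1);
>       }
>     }
>
> abort() would start the watchdog before doing anything else and interrupt it once abort processing completes. Runtime.halt() rather than System.exit() is the point: exit() runs shutdown hooks, which can block on the same stuck filesystem.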
> ===
> Pumahbase132 has the logs below. The DFSClient is not able to set up a write pipeline successfully; the RS tries to abort, but while aborting it gets stuck. I know there is a check that says if we are aborting because the filesystem is closed, we should not try to flush the logs while aborting. But in this case the fs is up and running; it just is not functioning. (A sketch of that guard follows, then the logs.)
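> To make the gap concrete, a hypothetical sketch of that guard (AbortLogFlush, HLogLike, and the flag are invented for illustration; this is not the actual HLog/HRegionServer code):
>
>     // Hypothetical sketch; not the actual HBase abort path.
>     class AbortLogFlush {
>       /** Stand-in for the WAL; only the sync() behavior matters here. */
>       interface HLogLike { void sync() throws java.io.IOException; }
>
>       private final HLogLike wal;
>
>       AbortLogFlush(HLogLike wal) { this.wal = wal; }
>
>       void flushOnAbort(boolean abortingBecauseFsClosed) throws java.io.IOException {
>         if (abortingBecauseFsClosed) {
>           return; // existing guard: a known-closed fs means don't even try
>         }
>         // An open-but-unresponsive fs passes the guard, and sync() can then
>         // block forever waiting for a DFS write pipeline: the hang shown below.
>         wal.sync();
>       }
>     }
>
> The guard only helps when the fs is known to be closed; an fs that is up but cannot build a pipeline sails past it, which is why the time-bounded halt above is needed.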
> 2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280 java.io.IOException: Bad connect ack with firstBadLink 10.38.133.33:50010
> 2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-8967376451767492285_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280 java.io.IOException: Bad connect ack with firstBadLink 10.38.134.59:50010
> 2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7172251852699100447_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280 java.io.IOException: Bad connect ack with firstBadLink 10.38.134.53:50010
> 2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-9153204772467623625_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.38.131.53:50010 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280 java.io.IOException: Bad connect ack with firstBadLink 10.38.134.49:50010
> 2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-2513098940934276625_6537229 for file /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
> 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3560)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2700(DFSClient.java:2720)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2977)
> 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-2513098940934276625_6537229 bad datanode[1] nodes == null
> 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280" - Aborting...
> 2011-04-21 23:48:07,216 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
> And then the RS gets stuck trying to roll the logs ...

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
