hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-497) RegionServer needs to recover if datanode goes down
Date Mon, 17 Mar 2008 20:18:24 GMT

    [ https://issues.apache.org/jira/browse/HBASE-497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579584#action_12579584 ]

stack commented on HBASE-497:
-----------------------------

Make one emission rather than the two that are in your patch:

{code}
+          LOG.error("Could not append to log because " + e.toString());
+          LOG.error("Opening a new writer.");
{code}

Also change it to LOG.error("Could not append to log... opening a new writer", e); i.e., pass
the actual exception so its stack trace gets logged too.
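
In diff terms against the two lines quoted above, roughly (the exact message wording is just a suggestion):

{code}
-          LOG.error("Could not append to log because " + e.toString());
-          LOG.error("Opening a new writer.");
+          LOG.error("Could not append to log... opening a new writer", e);
{code}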

Otherwise, I'm good with just applying this patch and trying this tactic.  I like the way
the retried append is done inside the IOE catch block and in turn catches its own IOE.  But on
that second throw, shouldn't we throw something other than an IOE, something that will break the
eternal looping that Michael B reports (IIUC)?  It seems like an IOE won't do.
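
To make the question concrete, here is a minimal sketch of the shape I mean, not the patch itself: the Writer interface, the createWriter() hook, and the RuntimeException are all stand-ins for illustration.  Retry once against a fresh writer, and if that second attempt also fails, escalate past IOException so the caller can't just keep spinning on it.

{code}
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class AppendRetrySketch {
  private static final Log LOG = LogFactory.getLog(AppendRetrySketch.class);

  /** Stand-in for the SequenceFile.Writer the HLog appends to. */
  interface Writer {
    void append(byte[] key, byte[] edit) throws IOException;
  }

  private Writer writer;

  AppendRetrySketch(Writer initial) {
    this.writer = initial;
  }

  /** Hypothetical hook that opens a writer on a fresh log file. */
  protected Writer createWriter() throws IOException {
    throw new IOException("not implemented in this sketch");
  }

  void append(byte[] key, byte[] edit) throws IOException {
    try {
      writer.append(key, edit);
    } catch (IOException e) {
      // Single emission, with the exception passed so its stack trace is logged.
      LOG.error("Could not append to log... opening a new writer", e);
      writer = createWriter();
      try {
        writer.append(key, edit);
      } catch (IOException second) {
        // Throwing another IOE here just feeds the caller's retry path again;
        // something unchecked (or a dedicated exception type) breaks the loop.
        throw new RuntimeException("Append failed after reopening writer", second);
      }
    }
  }
}
{code}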


> RegionServer needs to recover if datanode goes down
> ---------------------------------------------------
>
>                 Key: HBASE-497
>                 URL: https://issues.apache.org/jira/browse/HBASE-497
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.16.0
>            Reporter: Michael Bieniosek
>            Priority: Blocker
>             Fix For: 0.1.0, 0.2.0
>
>         Attachments: 497_0.1.patch
>
>
> If I take down a datanode, the regionserver will repeatedly return this error:
> java.io.IOException: Stream closed.
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:1875)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2096)
>         at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:141)
>         at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:124)
>         at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
>         at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:41)
>         at java.io.DataOutputStream.write(Unknown Source)
>         at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:977)
>         at org.apache.hadoop.hbase.HLog.append(HLog.java:377)
>         at org.apache.hadoop.hbase.HRegion.update(HRegion.java:1455)
>         at org.apache.hadoop.hbase.HRegion.batchUpdate(HRegion.java:1259)
>         at org.apache.hadoop.hbase.HRegionServer.batchUpdate(HRegionServer.java:1433)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:910)
> It appears that hbase/dfsclient does not attempt to reopen the stream.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

