hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10452) Fix potential bugs in exception handlers
Date Fri, 14 Feb 2014 01:30:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900995#comment-13900995
] 

Hudson commented on HBASE-10452:
--------------------------------

FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #89 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/89/])
HBASE-10452 Fix potential bugs in exception handlers (Ding Yuan) (tedyu: rev 1567979)
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Get.java
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/JVM.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/KeyPrefixRegionSplitPolicy.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java


> Fix potential bugs in exception handlers
> ----------------------------------------
>
>                 Key: HBASE-10452
>                 URL: https://issues.apache.org/jira/browse/HBASE-10452
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, master, regionserver, util
>    Affects Versions: 0.96.1
>            Reporter: Ding Yuan
>            Assignee: Ding Yuan
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk-v4.patch,
HBase-10452-trunk.patch
>
>
> Hi HBase developers,
> We are a group of researchers on software reliability. Recently we did a study and found
that majority of the most severe failures in HBase are caused by bugs in exception handling
logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore
we built a simple checking tool that automatically detects some bug patterns that have caused
some very severe real-world failures. I am reporting some of the results here. Any feedback
is much appreciated!
> Ding
> =========================
> Case 1:
>   Line: 134, File: "org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java"
> {noformat}
>   protected void releaseTableLock() {
>     if (this.tableLock != null) {
>       try {
>         this.tableLock.release();
>       } catch (IOException ex) {
>         LOG.warn("Could not release the table lock", ex);
>         //TODO: if we get here, and not abort RS, this lock will never be released
>       }
>     }
> {noformat}
> The lock is not released if the exception occurs, causing potential deadlock or starvation.
> Similar code pattern can be found at:
>   Line: 135, File: "org/apache/hadoop/hbase/regionserver/SplitRequest.java"
> ==========================================
> =========================
> Case 2:
>   Line: 252, File: "org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java"
> {noformat}
>     try {
>       Field fEnd = SequenceFile.Reader.class.getDeclaredField("end");
>       fEnd.setAccessible(true);
>       end = fEnd.getLong(this.reader);
>     } catch(Exception e) { /* reflection fail. keep going */ }
> {noformat}
> The caught Exception seems to be too general.
> While reflection-related errors might be harmless, the try block can throw
> other exceptions including "SecurityException", "IllegalAccessException", etc. Currently
> all those exceptions are ignored. Maybe
> the safe way is to ignore the specific reflection-related errors while logging and
> handling other types of unexpected exceptions.
> ==========================================
> =========================
> Case 3:
>   Line: 148, File: "org/apache/hadoop/hbase/HBaseConfiguration.java"
> {noformat}
>     try {
>       if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
>         isShowConf = true;
>       }
>     } catch (Exception e) {
>     }
> {noformat}
> Similar to the previous case, the exception handling is too general. While ClassNotFound
error might be the normal case and ignored, Class.forName can also throw other exceptions
(e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error
will be lost. So maybe change it to below:
> {noformat}
>     try {
>       if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
>         isShowConf = true;
>       }
>     } catch (LinkageError e) {
>       LOG.warn("..");
>       // handle linkage error
>     } catch (ExceptionInInitializerError e) {
>       LOG.warn("..");
>       // handle Initializer error
>     } catch (ClassNotFoundException e) {
>      LOG.debug("..");
>      // ignore
>     }
> {noformat}
> ==========================================
> =========================
> Case 4:
>   Line: 163, File: "org/apache/hadoop/hbase/client/Get.java"
> {noformat}
>   public Get setTimeStamp(long timestamp) {
>     try {
>       tr = new TimeRange(timestamp, timestamp+1);
>     } catch(IOException e) {
>       // Will never happen
>     }
>     return this;
>   }
> {noformat}
> Even if the IOException never happens right now, is it possible to happen in the future
due to code change?
> At least there should be a log message. The current behavior is dangerous since if the
exception ever happens
> in any unexpected scenario, it will be silently swallowed.
> Similar code pattern can be found at:
>   Line: 300, File: "org/apache/hadoop/hbase/client/Scan.java"
> ==========================================
> =========================
> Case 5:
>   Line: 207, File: "org/apache/hadoop/hbase/util/JVM.java"
> {noformat}
>    if (input != null){
>         try {
>           input.close();
>         } catch (IOException ignored) {
>         }
>       }
> {noformat}
> Any exception encountered in close is completely ignored, not even logged.
> In particular, the same exception scenario was handled differently in other methods in
the same file:
> Line: 154, same file
> {noformat}
>        if (in != null){
>          try {
>            in.close();
>          } catch (IOException e) {
>            LOG.warn("Not able to close the InputStream", e);
>          }
>        }
> {noformat}
> Line: 248, same file
> {noformat}
>       if (in != null){
>         try {
>           in.close();
>         } catch (IOException e) {
>           LOG.warn("Not able to close the InputStream", e);
>         }
>       }
> {noformat}
> ==========================================
> =========================
> Case 6: empty handler for exception: java.io.IOException
>   Line: 312, File: "org/apache/hadoop/hbase/rest/RowResource.java"
> {noformat}
>     } finally {
>       if (table != null) try {
>         table.close();
>       } catch (IOException ioe) { }
>     }
> {noformat}
> IOException is completely ignored. This behavior is inconsistent with the same
> code snippet at line 249 in the same file, where the IOExceptions was logged:
> {noformat}
>    } finally {
>       if (table != null) try {
>         table.close();
>       } catch (IOException ioe) {
>         LOG.debug("Exception received while closing the table", ioe);
>       }
>     }
> {noformat}
> ==========================================
> =========================
> Case 7:
>   Line: 95, File: "org/apache/hadoop/hbase/master/handler/EnableTableHandler.java"
> {noformat}
>         try {
>           this.assignmentManager.getZKTable().removeEnablingTable(tableName, true);
>         } catch (KeeperException e) {
>           // TODO : Use HBCK to clear such nodes
>           LOG.warn("Failed to delete the ENABLING node for the table " + tableName
>               + ".  The table will remain unusable. Run HBCK to manually fix the problem.");
>         }
> {noformat}
> The log message in the exception handler and the comment seem to suggest that such nodes
should be cleared using HBCK.
> ==========================================
> =========================
> Case 8:
>   Line: 463, File: "org/apache/hadoop/hbase/client/ClientScanner.java"
> {noformat}
>         try {
>           this.caller.callWithRetries(callable);
>         } catch (IOException e) {
>           // We used to catch this error, interpret, and rethrow. However, we
>           // have since decided that it's not nice for a scanner's close to
>           // throw exceptions. Chances are it was just an UnknownScanner
>           // exception due to lease time out.
>         }
> {noformat}
> Currently the handler is empty because it may be caused by "just an UnknownScanner exception".
> But what if it has other causes? Maybe the catch block can differentiate the exception
into
> different causes, ignoring the ones caused by UnknownScanner while handle others differently?
> ==========================================



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message