hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions
Date Tue, 06 Sep 2011 19:11:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098279#comment-13098279
] 

jiraposter@reviews.apache.org commented on HBASE-4014:
------------------------------------------------------



bq.  On 2011-08-12 23:46:30, Gary Helmling wrote:
bq.  > Nice work, Eugene.  I think we're getting close.  Just two suggested improvements
below.
bq.  > 
bq.  > The main question still open to debate, I think, is whether or not aborting the
server on unhandled exceptions is appropriate.
bq.  > 
bq.  > On the one hand, aborting takes the fail-fast approach and makes buggy coprocessors
much more visible.  It's a lot more likely that a bug will be noticed and fixed if it brings
down a region server!
bq.  > 
bq.  > On the other hand, I think coprocessors already pose enough of a stability risk
to a cluster.  I think we should be working to minimize that by containing the impact that
a buggy coprocessor can have.  If they coprocessor really wants or needs to trigger an abort,
it can already do so, since (Master|RegionServer)Services extend Server, which extends Abortable.
bq.  > 
bq.  > I think I'd be more in favor of removing the coprocessor from the active set (we
should make this as visible as possible so it's clear the coprocessor is no longer "active"),
or at least wrapping the exception in a DoNotRetryIOException and communicating it back to
the client?  Maybe both?
bq.  > 
bq.  > I guess I'd be okay with a configuration option to abort on error (I think a single
config option is sufficient), as long as it's disabled by default.  But that would still imply
we need some other handling when the option is disabled.
bq.  
bq.  Michael Stack wrote:
bq.      I like Gary's reasoning here.

Thanks Gary and Michael for your comments. Latest patch defines a new config option hbase.coprocessor.abort_on_error,
which defaults to false. 

CoprocessorHost#handleCoprocessorThrowable() removes the buggy coprocessor from its active
set of coprocesors.

As Gary said, a coprocessor author can explicitly code their coprocessor to abort if they
want to. On the other hand, coprocessor authors might like to develop or test without having
to explicitly take care of this, so perhaps hbase.coprocessor.abort_on_error=true would be
useful for them to define.


bq.  On 2011-08-12 23:46:30, Gary Helmling wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java, line 71
bq.  > <https://reviews.apache.org/r/969/diff/6/?file=31373#file31373line71>
bq.  >
bq.  >     I would just synchronize the set here:
bq.  >     
bq.  >     Set<String> coprocessorNames = Collections.synchronizedSet(new HashSet<String>());

Fixed, thanks.


bq.  On 2011-08-12 23:46:30, Gary Helmling wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java, line 106
bq.  > <https://reviews.apache.org/r/969/diff/6/?file=31373#file31373line106>
bq.  >
bq.  >     If you move this into loadInstance() then you don't have to duplicate it elsewhere,
since all the other load methods wind up calling that.

moved to loadInstance(), thanks.


- Eugene


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1433
-----------------------------------------------------------


On 2011-09-06 19:08:59, Eugene Koontz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-09-06 19:08:59)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors
in logged exceptions
bq.  
bq.  The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor
call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along to the client
(if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.      https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 4e492e1 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 3f60653 
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 8ff6e62 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 5796413

bq.    src/main/resources/hbase-default.xml 2c8f44b 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java
PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.



> Coprocessors: Flag the presence of coprocessors in logged exceptions
> --------------------------------------------------------------------
>
>                 Key: HBASE-4014
>                 URL: https://issues.apache.org/jira/browse/HBASE-4014
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors
>            Reporter: Andrew Purtell
>            Assignee: Eugene Koontz
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch,
HBASE-4014.patch
>
>
> For some initial triage of bug reports for core versus for deployments with loaded coprocessors,
we need something like the Linux kernel's taint flag, and list of linked in modules that show
up in the output of every OOPS, to appear above or below exceptions that appear in the logs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message