hbase-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2742) Replace forked HBase RPC with Hadoop RPC
Date Thu, 17 Jun 2010 16:25:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879831#action_12879831 ]

Todd Lipcon commented on HBASE-2742:
------------------------------------

+1 for updating our copy-pasted RPC code with security.
Regarding the OOME swallowing: it's done so that sending a malformed packet to a server doesn't
crash it. It's far too easy to craft a packet (on purpose or by accident) that has a 100GB
"length" header in it and so causes an OOME when the server tries to allocate a buffer for it.
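
To make the failure mode concrete, here is a minimal sketch in Java of a length-prefixed read that survives a bogus length header. It is not the HBaseServer code. The class name FrameReader, the 64 MB cap MAX_FRAME_BYTES, and the 4-byte signed length prefix are all assumptions made for illustration. It shows two defenses side by side: rejecting implausible lengths up front, and turning an OutOfMemoryError from the buffer allocation into a per-request IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper, not part of HBase or Hadoop.
class FrameReader {
    // Assumed upper bound on a sane request size; a real server would make this configurable.
    static final int MAX_FRAME_BYTES = 64 * 1024 * 1024;

    // Reads one frame: a 4-byte big-endian length followed by that many payload bytes.
    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        // Defense 1: refuse lengths that cannot belong to a legitimate request.
        if (length < 0 || length > MAX_FRAME_BYTES) {
            throw new IOException("Rejecting frame with implausible length " + length);
        }
        byte[] payload;
        try {
            payload = new byte[length];
        } catch (OutOfMemoryError e) {
            // Defense 2: a failed allocation for one request becomes an error on that
            // connection only, instead of propagating and taking the server down.
            throw new IOException("Could not allocate " + length + " bytes for frame", e);
        }
        in.readFully(payload);
        return payload;
    }

    public static void main(String[] args) {
        // A header claiming Integer.MAX_VALUE bytes, with no payload behind it.
        byte[] malformed = {0x7f, (byte) 0xff, (byte) 0xff, (byte) 0xff};
        try {
            readFrame(new DataInputStream(new ByteArrayInputStream(malformed)));
        } catch (IOException e) {
            System.out.println("Dropped bad request: " + e.getMessage());
        }
    }
}
```

With only the second defense in place, the server still attempts the allocation, so swallowing the OOME is what keeps a single bad packet from aborting the process.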

> Replace forked HBase RPC with Hadoop RPC
> ----------------------------------------
>
>                 Key: HBASE-2742
>                 URL: https://issues.apache.org/jira/browse/HBASE-2742
>             Project: HBase
>          Issue Type: Improvement
>          Components: ipc
>            Reporter: Gary Helmling
>
> The HBase RPC code (org.apache.hadoop.hbase.ipc.*) was originally forked off of Hadoop
RPC classes, with some performance tweaks added.  Those optimizations have come at a cost,
however: keeping up with Hadoop RPC changes, both bug fixes and improvements/new features.
 
> In particular, this impacts how we implement security features in HBase (see HBASE-1697
and HBASE-2016).  The secure Hadoop implementation (HADOOP-4487) relies heavily on RPC changes
to support client authentication via Kerberos and to secure and mutually authenticate client/server
connections via SASL.  Making use of the built-in Hadoop RPC classes will gain us these pieces
for free in a secure HBase.
> So, I'm proposing that we drop the HBase forked version of RPC and convert to direct
use of Hadoop RPC, while working to contribute important fixes back upstream to Hadoop core.
 Based on a review of the HBase RPC changes, the key divergences seem to be:
> HBaseClient:
>  - added use of TCP keepalive (HBASE-1754)
>  - made connection retries and sleep configurable (HBASE-1815)
>  - prevent NPE if socket == null due to creation failure (HBASE-2443)
> HBaseRPC:
>  - mapping of method names <-> codes (removed in HBASE-2219)
> HBaseServer:
>  - use of TCP keepalive (HBASE-1754)
>  - OOME in server does not trigger abort (HBASE-1198)
> HbaseObjectWritable:
>  - allows List<> serialization
>  - includes its own class <-> code mapping (HBASE-328)
> Proposed process is:
> 1. open issues with patches on Hadoop core for important fixes/adjustments from HBase
RPC (HBASE-1198, HBASE-1815, HBASE-1754, HBASE-2443, plus a pluggable ObjectWritable implementation
in RPC.Invocation to allow use of HbaseObjectWritable).
> 2. ship a Hadoop version with RPC patches applied -- ideally we should avoid another
copy-n-paste code fork, subject to ability to isolate changes from impacting Hadoop internal
RPC wire formats
> 3. if all Hadoop core patches are applied we can drop back to a plain vanilla Hadoop
version
> I realize there are many different opinions on how to proceed with HBase RPC, so I'm
hoping this issue will kick off a discussion on what the best approach might be.  My own motivation
is maximizing re-use of the authentication and connection security work that's already gone
into Hadoop core.  I'll put together a set of patches around #1 and #2, but obviously we need
some consensus around this to move forward.  If I'm missing other differences between HBase
and Hadoop RPC, please list as well.  Discuss!
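
For readers unfamiliar with the HBaseClient divergences listed in the quoted description (TCP keepalive, configurable connection retries and sleep, and guarding against a null socket), the following Java sketch shows roughly what those three behaviors look like together. It is not the HBaseClient source. The class RetryingConnector, its method names, and its parameters are hypothetical, and in the real client the retry count and sleep would come from configuration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical illustration, not part of HBase or Hadoop.
class RetryingConnector {
    // Tries to connect up to maxRetries + 1 times, sleeping sleepMillis between attempts.
    static Socket connect(InetSocketAddress addr, int maxRetries, long sleepMillis,
                          int timeoutMillis) throws IOException, InterruptedException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Socket socket = new Socket();
            try {
                socket.setKeepAlive(true);          // TCP keepalive on the client side
                socket.connect(addr, timeoutMillis);
                return socket;
            } catch (IOException e) {
                last = e;
                closeQuietly(socket);
                if (attempt < maxRetries) {
                    Thread.sleep(sleepMillis);      // caller-supplied pause before retrying
                }
            }
        }
        throw last;                                 // non-null: the loop ran at least once
    }

    // Safe to call even when socket creation failed and the reference is still null.
    static void closeQuietly(Socket socket) {
        if (socket == null) {
            return;
        }
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful to do if close itself fails.
        }
    }
}
```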

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

