hbase-dev mailing list archives

From Gary Helmling <ghelml...@gmail.com>
Subject Re: modular build and pluggable rpc
Date Tue, 31 May 2011 22:47:02 GMT
On Tue, May 31, 2011 at 1:22 PM, Stack <stack@duboce.net> wrote:

> On Mon, May 30, 2011 at 9:55 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
> > Maven modulation could be enhanced to have a structure looks like this:
> >
> > Super POM
> >  +- common
> >  +- shell
> >  +- master
> >  +- region-server
> >  +- coprocessor
> >
> > The software is basically group by processor type (role of the
> > process) and a shared library.
> >
> I'd change the list above.  shell should be client and perhaps master
> and regionserver should be both inside a single 'server' submodule.
> We need to add security in there.  Perhaps we'd have a submodule for
> thrift, avro, rest (and perhaps rest war file)?  (Is this too many
> submodules  -- I suppose once we are submodularized, adding new ones
> is trivial.  Its the initial move to submodules that is painful)

I'd be in favor of starting simply as well.  Something like:

- common
- client
- server
- security

or even combine the "common" bits into "client".  I agree thrift, avro
and rest would make perfect module candidates as well, but I don't feel
particularly strongly about them myself.  I also don't really see the
coprocessor framework as a separate module.  It's more like part of the
server infrastructure.

HTTP/REST is one good option to have (among many) as an application
interface to HBase.  But I'm skeptical of its applicability as an internal
RPC transport.  Personally, I think we need a well-defined (but still
performant) serialization format to better support cross-version operation
and alternate clients such as asynchbase.  The actual RPC framework we use
(from Hadoop) may not be perfect, but it's seen a lot of profiling and its
threading model seems to perform pretty well for HBase workloads with
long-lived connections.
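
To make the serialization point concrete, here is a minimal sketch in Java
of one way a wire format can tolerate version skew.  It assumes a made-up
frame layout: one version byte, a 4-byte body length, then length-prefixed
UTF-8 fields.  The names VersionedFrame and GetRequest are hypothetical and
this is not the HBase or Hadoop RPC format; it only illustrates how a length
prefix lets an older reader skip fields appended by a newer writer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical frame codec, for illustration only (not an HBase class).
final class VersionedFrame {

    // Hypothetical request type carrying the two fields every version knows.
    static final class GetRequest {
        final int version;
        final String table;
        final String row;

        GetRequest(int version, String table, String row) {
            this.version = version;
            this.table = table;
            this.row = row;
        }
    }

    // Frame layout (assumed): [version:1][bodyLength:4][table][row][extra...]
    // "extra" stands in for fields that only a newer writer knows about.
    static byte[] encode(GetRequest r, byte[] extra) {
        try {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(body);
            writeString(out, r.table);
            writeString(out, r.row);
            out.write(extra);
            out.flush();

            ByteArrayOutputStream frame = new ByteArrayOutputStream();
            DataOutputStream f = new DataOutputStream(frame);
            f.writeByte(r.version);
            f.writeInt(body.size());
            body.writeTo(f);
            f.flush();
            return frame.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the known fields and ignores any trailing bytes inside the body,
    // so a reader built against an older layout still decodes newer frames.
    static GetRequest decode(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            int version = in.readUnsignedByte();
            byte[] body = new byte[in.readInt()];
            in.readFully(body);

            DataInputStream b = new DataInputStream(new ByteArrayInputStream(body));
            String table = readString(b);
            String row = readString(b);
            return new GetRequest(version, table, row);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(b.length);
        out.write(b);
    }

    private static String readString(DataInputStream in) throws IOException {
        byte[] b = new byte[in.readInt()];
        in.readFully(b);
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A "version 2" writer appends two bytes this reader does not understand.
        byte[] frame = encode(new GetRequest(2, "t1", "row-1"), new byte[] {7, 7});
        GetRequest r = decode(frame);
        System.out.println(r.version + " " + r.table + " " + r.row);
    }
}
```

Because the format is specified byte by byte rather than tied to one
language's object serialization, a non-Java or asynchronous client could in
principle implement the same framing independently.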

The current framework also continues to evolve, with some recent effort to
work in asynchronous handling on the server side.  In addition, we have
full support for security via Kerberos and token-based DIGEST-MD5
authentication in a separate branch.  I'm personally not really interested
in repeating the work to incorporate security over a new HTTP-based stack.
I think I'd need some convincing that an HTTP transport would perform better
than what we have.  I'm more inclined to go an evolutionary route in
improving our current stack.

