hbase-dev mailing list archives

From Andrew Purtell <andrew.purt...@gmail.com>
Subject Re: HBASE-2182
Date Sat, 30 Jun 2012 01:10:50 GMT
I worry it's more complicated than that given nobody seems to have done it, at least... "netty
SASL" or "netty wrap SASL" or "netty SASL socket" turns up paltry results in a Google search.
Avro considered it but didn't. We considered it for Zookeeper but didn't. (Excluded very early
due to ZK authentication design particulars though.)

    - Andy

On Jun 29, 2012, at 5:59 PM, Elliott Clark <eclark@stumbleupon.com> wrote:

> Sorry, I only alluded to it in the bullet point about the filter model.  I
> would imagine that as a filter (or two) in the channel stack.  It's
> honestly something that I haven't gotten to looking at in depth yet.
> On Fri, Jun 29, 2012 at 5:34 PM, Andrew Purtell <andrew.purtell@gmail.com> wrote:
>> Without SASL/krb/security integration with the rest of Hadoop this would
>> be a nonstarter for us. I didn't see that mentioned?
>> On Jun 29, 2012, at 5:04 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>> A few inline notes below:
>>> On Fri, Jun 29, 2012 at 4:42 PM, Elliott Clark <eclark@stumbleupon.com> wrote:
>>>> I just posted a pretty early skeleton
>>>> (https://issues.apache.org/jira/browse/HBASE-2182) of what I think a
>>>> netty based hbase client/server could look like.
>>>> Pros:
>>>> - Faster
>>>>    - Giraph got a 3x perf improvement by dropping hadoop rpc
>>> What's the reference for this? The 3x perf I heard about from Giraph was
>>> from switching to using LMAX's Disruptor instead of queues, internally. We
>>> could do the same, but I'm not certain the model works well for our use
>>> cases where the RPC processing can end up blocked on disk access, etc.
>>>>    - Asynchbase trounced our client when JD benchmarked them
>>> I'm still convinced that the majority of this has to do with the way our
>>> batching happens to the server, not async vs sync. (in the current sync
>>> client, once we fill up the buffer, we "flush" from the same thread, and
>>> block the flush until all buffered edits have made it, vs doing it in the
>>> background). We could fix this without going to a fully async model.
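To make the batching point above concrete, here is a minimal sketch, under stated assumptions, of a write buffer that hands a full batch to a background thread instead of flushing it from the caller's thread. `BackgroundFlushBuffer` and its `sender` callback are hypothetical names used only for illustration; they are not part of the HBase client, and the sketch ignores error handling and backpressure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical write buffer (not an HBase class): full batches are handed to a
// background thread, so put() returns without waiting for the send to finish.
final class BackgroundFlushBuffer<T> implements AutoCloseable {
    private final int capacity;
    // Assumption: stands in for whatever RPC actually ships a batch to the server.
    private final Consumer<List<T>> sender;
    // A single thread keeps batches in submission order. Its queue is unbounded,
    // so this sketch has no backpressure; a real client would need some.
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();
    private List<T> buffer = new ArrayList<>();

    BackgroundFlushBuffer(int capacity, Consumer<List<T>> sender) {
        this.capacity = capacity;
        this.sender = sender;
    }

    // Buffers one edit; once the buffer is full, schedules a background flush.
    synchronized void put(T edit) {
        buffer.add(edit);
        if (buffer.size() >= capacity) {
            flushAsync();
        }
    }

    // Swaps in a fresh buffer and sends the old one on the flusher thread.
    synchronized void flushAsync() {
        if (buffer.isEmpty()) {
            return;
        }
        final List<T> batch = buffer;
        buffer = new ArrayList<>();
        flusher.execute(() -> sender.accept(batch));
    }

    // Flushes what is left and waits (bounded) for queued batches to be sent.
    @Override
    public void close() {
        flushAsync();
        flusher.shutdown();
        try {
            flusher.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The caller only blocks in `close()`; each `put()` that fills the buffer returns as soon as the batch is queued. This changes where the waiting happens without making the client API asynchronous.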
>>>> - Could encourage things to be a little more modular if everything isn't
>>>> hanging directly off of HRegionServer
>>> Sure, but not sure I see why this is Netty vs not-Netty
>>>> - Netty is better about thread usage than hadoop rpc server.
>>> Can you explain further?
>>>> - Pretty easy to define an rpc protocol after all of the work on
>>>> protobuf (Thanks everyone)
>>>> - Decoupling the rpc server library from the hadoop library could allow
>>>> us to rev the server code easier.
>>>> - The filter model is very easy to work with.
>>>>    - Security can be just a single filter.
>>>>    - Logging can be another.
>>>>    - Stats can be another.
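The filter model in the bullets above can be sketched in plain Java as a chain of small stages, each of which either answers the call or passes it on. This is only an illustration of the idea: it does not use Netty's pipeline API, and every type here (`RpcCall`, `RpcFilter`, `FilterChain`, and the three filters) is a hypothetical name. The security check is a placeholder, not SASL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical request type; not an HBase or Netty class.
final class RpcCall {
    final String user;   // null means the caller is unauthenticated
    final String method;

    RpcCall(String user, String method) {
        this.user = user;
        this.method = method;
    }
}

// One stage in the chain: may answer the call itself or pass it to the next stage.
interface RpcFilter {
    String handle(RpcCall call, FilterChain next);
}

// Walks the filters in order, then invokes the endpoint that does the real work.
final class FilterChain {
    private final List<RpcFilter> filters;
    private final int index;
    private final Function<RpcCall, String> endpoint;

    FilterChain(List<RpcFilter> filters, Function<RpcCall, String> endpoint) {
        this(filters, 0, endpoint);
    }

    private FilterChain(List<RpcFilter> filters, int index, Function<RpcCall, String> endpoint) {
        this.filters = filters;
        this.index = index;
        this.endpoint = endpoint;
    }

    String proceed(RpcCall call) {
        if (index == filters.size()) {
            return endpoint.apply(call);
        }
        return filters.get(index).handle(call, new FilterChain(filters, index + 1, endpoint));
    }
}

// Placeholder security stage: rejects calls that carry no user.
final class SecurityFilter implements RpcFilter {
    public String handle(RpcCall call, FilterChain next) {
        return call.user == null ? "DENIED" : next.proceed(call);
    }
}

// Records one line per call that reaches it.
final class LoggingFilter implements RpcFilter {
    final List<String> lines = new ArrayList<>();

    public String handle(RpcCall call, FilterChain next) {
        lines.add(call.user + " -> " + call.method);
        return next.proceed(call);
    }
}

// Counts the calls that reach it.
final class StatsFilter implements RpcFilter {
    final AtomicLong calls = new AtomicLong();

    public String handle(RpcCall call, FilterChain next) {
        calls.incrementAndGet();
        return next.proceed(call);
    }
}
```

Ordering is a design choice: with the security filter first, rejected calls never reach the logging or stats stages. Placing stats first would count rejected calls too.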
>>>> Cons:
>>>> - Netty and non apache rpc servers don't play well together.  They might
>>>> be able to, but I haven't gotten there yet.
>>> What do you mean "non apache rpc servers"?
>>>> - Complexity
>>>>    - Two different servers in the src
>>>>    - Confusing users who don't know which to pick
>>>> - Non-blocking could make the client harder to write.
>>>> I'm really just trying to gauge what people think of the direction and if
>>>> it's still something that is wanted.  The code is a loooooong way from even
>>>> being a tech demo, and I'm not a netty expert, so suggestions would be
>>>> welcomed.
>>>> Thoughts? Are people interested in this? Should I push this to my github
>>>> so others can help?
>>> IMO, I'd want to see a noticeable perf difference from the change -
>>> unfortunately it would take a fair amount of work to get to the point where
>>> you could benchmark it. But if you're willing to spend the time to get to
>>> that point, seems worth investigating.
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
