hbase-dev mailing list archives

From Devaraj Das <d...@hortonworks.com>
Subject Re: What's going on? Two C++ clients being developed at the moment?
Date Tue, 19 Apr 2016 05:40:52 GMT
Elliott, on the point about sync versus async, as I said earlier, that's an area we are looking
at, and we'd welcome contributions from you very much as well. But at the same time, I don't
want the perfect to be the enemy of the good. If something is working well in the sync mode,
why would we block it?

On your point about configuration, I don't see why we shouldn't be providing a Java-client
compatible configuration story. You raise the point about pure C++ clients. I say that we
have cases where we have deployment tools (like Ambari) deploy configurations and it'll be
better to not introduce a C++ configuration for them to deploy. But again this is an area
that we could look at as a pluggable thing.

On your point about Support, we have customers waiting to consume the library. I thought even
Facebook is looking at using the HBase C++ client? If there are users for a feature, it shouldn't
rot; otherwise we are not doing a good job of taking care of our users, right? But I agree that
we have to be careful about this one, given this is an entirely new implementation of a large piece
of software - we should look at what we could do better here w.r.t. code rot.

On your point about Copyright/Authors/GNU-license-stuff, Vamsi has responded to such comments
from you on RB already, I believe. In any case, if there are issues around these topics, we
will address them as a top priority. (Putting on my vendor hat: as a vendor of open-source software,
we will have to do this before we can ship this in any HDP release.)

________________________________________
From: Elliott Clark <eclark@apache.org>
Sent: Monday, April 18, 2016 8:41 PM
To: dev@hbase.apache.org
Subject: Re: What's going on? Two C++ clients being developed at the moment?

On Mon, Apr 18, 2016 at 4:57 PM, Devaraj Das <ddas@hortonworks.com> wrote:

> 1. Be able to do async RPC. Vamsi's current implementation does sync. I
> know Vamsi is already looking at that aspect. Just to be clear, in my
> mind this doesn't qualify as a blocker - since we don't have an async
> client API to support yet.
>

There's no deadline on creating a good client. We should do what's better
not what's fastest to implement. We know that the way HBase is used results
in way too many threads if everything is sync. There's no prize for getting
things done before they are the best possible. We should be learning from
our many many mistakes in the first client and creating something better.
We shouldn't be re-creating the same exact mess in a different language.

There are no backwards compat rules, so we should make what works the best
for HBase going forward. And as clusters get larger and larger one thread
per server or per request is not going to scale. We're already in a place
that the number of threads is a factor. Just imagine as clusters continue
to grow. Locking and context switches will abound.


> 2. Switch to C++ ways of configuring the hbase client configuration. This
> is something I am really not sure about. By going this route, we'd have to
> be able to manage two different ways of configuring things - one for Java
> and another for C++. This will lead to unnecessary duplication of configs
> and such (and the deployment tools would now have to be aware of a new
> way of configuring c++ clients). But we can take a look at making this
> configuration method pluggable if it makes sense.
>

We should do what is best for each use case. You don't see xml going well
with python, ruby, or javascript configs. If the native client is going to be
used by people outside of the java world, then we should give them the
experience that delights them, not one that makes them think about java. None
of the best designed libraries in cpp that I can think of use an xml file
for a config.

> Elliott, regarding your concern as to who would support the large code drop
> .. we did talk about breaking the patch up into smaller ones if it makes
> sense for reviews and such.
>

Splitting it up is nice, and shows that software is being developed in the
open, and not code dropped. However my concern is less about the size of
the patch and more about the body of work being supported. HBase has had a
bad history of taking code, having no one support it, and then having bit
rot in our repository.


> I think Vamsi has already addressed the other concerns to do with
> Copyright headers, etc.
>

Sorry, this hasn't been addressed at all.
There are copyright headers from other projects (install-sh, missing).
Importing some code from other projects seems dubious at best. And at worst
seems like copy paste. That could come back and bite us. Once trust in a
code's origin is lost it's really hard to integrate it. I don't want to
have to go through every line and see where it came from.

There's other people's names in the code. Everyone who wrote the code needs
to assign copyright to Apache. Saying "I worked with someone else, I can
remove their name" is not fixing the issue.

There's GNU licensed code. GNU is specifically called out as not being able
to be in our repos.

None of these are addressed and to be honest they scare me. We can move
fast and import code that's not 100% working or tested. But playing fast
and loose with laws is just another thing entirely.
