hbase-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: HDFS-1599 status? (HDFS tickets to improve HBase)
Date Sun, 05 Jun 2011 22:40:47 GMT
On Sat, Jun 4, 2011 at 1:46 AM, Andrew Purtell <apurtell@apache.org> wrote:

> This is not discouraging. :-)
> HBasers patch CDH because trunk -- anything > 0.20 actually -- is not
> trusted by consensus if you look at all of the production deployments. Does
> ANYONE run trunk under anything approaching "production"? And trunk/upstream
> has a history of ignoring any HBase specific concern. So the use of and
> trading of patches will probably continue for a while, maybe forever.

Right - I wasn't suggesting that you run trunk in production as of yet. But
there has been very little activity in terms of HBase people running trunk
in dev/test clusters in the past. Stack has done some awesome work here in
the last few weeks, so that should open it up for some more people to jump
on board.

I agree that HBase has been treated as a second-class citizen in recent
years from HDFS's perspective, but I think that has changed. All of the
major HDFS contributors now have serious stakes in HBase, and so long as
there are patches with sufficient testing that apply against trunk, I don't
see a reason they wouldn't be included.

> Part of the problem is the expectation that any patch provided against
> trunk may generate months of back and forth, as we have seen, which presents
> difficulties to a potential contributor who does not work on e.g. HDFS
> matters full time. Alternatively it may pick up a committer as sponsor and
> then be vetoed by Yahoo because they're mad at Cloudera over some unrelated
> issue and a patch appears to have a Cloudera sponsor and/or vice versa.
> Now, that situation I describe _is_ discouraging. It's not enough to say
> that we must contribute through trunk. Trunk needs to earn back our trust.

Yes, there have been some unfortunate things in the past. There have also
been some half-finished or untested patches proposed, and you can't blame
HDFS folks for not taking a big patch that doesn't have a lot of confidence
behind it.

I've been thinking about this this afternoon, and have an idea. It may prove
to be an awful one, but maybe it's a good one; only time will tell :) I'll
create a branch off of HDFS trunk specifically for HBase performance work.
We can commit these "90% done" patches there, which will make it easier for
others to test and gain confidence. Branches also can make it easier to
maintain patches over time with a changing trunk.

How does this sound to the HBase community? If it seems like a good idea,
*and* there are some people who would be willing to set it up on some small
dev clusters and run load tests, I'll move forward with it.

> I believe I recently saw discussion that append should be removed or
> disabled by default on 0.22 or trunk. Did you see anything like this? If I
> am mistaken, fine. If not, this is going in the wrong direction, for
> example.

Not sure what you're referring to - I don't remember any discussion like

Todd Lipcon
Software Engineer, Cloudera
