hbase-dev mailing list archives

From Jonathan Gray <jg...@facebook.com>
Subject RE: [DISCUSS] HBase as Apache top-level project?
Date Thu, 18 Mar 2010 19:09:53 GMT
I would like to see HBase support alternative filesystems in the future.  There has been talk
of other up-and-coming DFSs, built more for random access, that might make sense for
some use cases.  I imagine a time down the road when there would be a choice of DFS depending
on the particular use case.

Users coming from the Hadoop world, who would be using both and are likely more oriented towards
analytics, would just add HBase atop Hadoop.  Someone coming from a relational database who
is interested in fast read/write random access might be able to choose a DFS more closely
suited to that use case.  Hopefully HDFS gets better at this so it could be the leader across
the board, but I don't think we should necessarily be married to it.  Apart from possible
differences in append APIs, it should not in general be difficult to plug in a different DFS
(and it's been done in the past with kfs).

While it would be nice if active HBase committers were eventually made Hadoop PMC members,
to this point this has not happened (I believe stack was already on the Hadoop PMC when HBase
became a sub-project).  When we want to add a new committer we now have to build a case to
people who actually have no insight into our community, rather than allowing our community
(which I believe is big enough to support itself) to make its own decisions.

Also, I've not seen Stack's presence on the Hadoop PMC in any way contribute to the likelihood
of an HDFS patch getting committed.

That being said, we would not want to create any bad blood w/ the Hadoop community.  Dhruba,
do you think that is a risk?

JG

> -----Original Message-----
> From: Dhruba Borthakur [mailto:dhruba@gmail.com]
> Sent: Thursday, March 18, 2010 11:08 AM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: [DISCUSS] HBase as Apache top-level project?
> 
> Hi Stack,
> 
> Can HBase (in theory) be used on filesystems/MR other than Hadoop?
> 
> I see one primary disadvantage of moving away from the Hadoop project.
> Please let me explain. In the Hadoop world, if a committer is actively
> contributing code, she/he becomes part of the Hadoop PMC. This means
> that active HBase committers would (over time) become Hadoop PMC
> members. This might allow HBase-related fixes to get into HDFS much
> more easily. If HBase moves away from Hadoop, then HBase developers
> will not have a part to play in guiding HDFS to make it more amenable
> to HBase usage.
> 
> The case is different for ZK and Avro. They are not related to Hadoop
> HDFS/MR at all.
> 
> I am not voting against this proposal, just laying out my viewpoint.
> 
> thanks,
> dhruba
> 
> 
> On Thu, Mar 18, 2010 at 10:43 AM, Stack <stack@duboce.net> wrote:
> 
> > On Thu, Mar 18, 2010 at 10:15 AM, Andrew Purtell <apurtell@apache.org> wrote:
> > >
> > > HBase is an integrated optional part of a Hadoop stack more
> > > than a standalone component, but other ASF TLPs build on top
> > > of other projects. I suppose HDFS and ZK are going to be TLPs
> > > at some point also, is that true? Leaving Hadoop as just the
> > > MR framework?
> >
> > If the board allows us to be a TLP, Zookeeper would probably be made a
> > TLP at the same time.
> >
> > There hasn't been a vote, but it seems that the thought is that HDFS
> > would stay within the hadoop fold; i.e. hdfs+mapreduce+common would
> > stay.
> >
> > >
> > > Anyway, what I like is HBase will stand on its own merits.
> > >
> > > What are the risks of being a TLP?
> > >
> >
> > I'm sure there are some but I'm blinded by the upside at the moment.
> >
> > St.Ack
> >
> 
> 
> 
> --
> Connect to me at http://www.facebook.com/dhruba
