hbase-dev mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: HBase - Secondary Index
Date Wed, 05 Dec 2012 21:03:04 GMT
I'd say "it depends".
It seems everybody wants secondary indexes in HBase. The problem is that most folks don't
agree on what that actually means.

The most interesting problem and discussion point (IMHO) is that HBase would need some form
of schema description.

See also HBASE-7221. Assuming we had something like that, it could be a building block for
a simple built-in secondary index solution.

In the end you're probably right and we cannot prescribe a single secondary index solution
that will fit all use cases.

-- Lars


----- Original Message -----
From: Jonathan Hsieh <jon@cloudera.com>
To: dev@hbase.apache.org
Cc: 
Sent: Wednesday, December 5, 2012 11:23 AM
Subject: Re: HBase - Secondary Index

I personally feel that we should only add the primitive APIs necessary to
make this work (and if none are required, all the better!), and try to
keep secondary indexing work as a separate project on top of HBase,
because there are many possible valid architectures and implementations.

I have two question areas. The first, touched upon in previous messages:
what logging is done, and what consistency guarantees do we get between the
index and primary?

The other has to do with scalability. I'm not sure I interpreted the slides
correctly, but from the slides 8 and 13, is the architecture such that each
primary table region has a corresponding index table region?

Is slide 14 a comparison of a full table scan vs. the indexed lookups on 4
region servers?  What happens if we go up to 20, or 100?  If I'm right about
there being one index region per table region, I have a feeling this isn't
going to scale well with a large number of regions (since a lookup would
potentially have to talk to each region and essentially every region server).
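
A rough sketch of the fan-out concern, under stated assumptions: it is not taken from the slides, and the `Region` class, `build_table`, and `indexed_lookup` are hypothetical names for a toy in-memory model, not HBase APIs. It assumes each primary region keeps its own co-located index, so a lookup by indexed value cannot tell in advance which regions hold matches and has to ask every one of them.

```python
from collections import defaultdict


class Region:
    """Toy stand-in for one primary region plus its co-located index region."""

    def __init__(self):
        self.rows = {}                       # row key -> column value
        self.local_index = defaultdict(set)  # column value -> row keys in THIS region

    def put(self, row_key, value):
        self.rows[row_key] = value
        self.local_index[value].add(row_key)

    def lookup(self, value):
        # Only sees rows stored in this region.
        return set(self.local_index.get(value, ()))


def build_table(num_regions, num_rows):
    """Spread rows over regions by row key only (modulo here for brevity;
    real HBase regions cover contiguous key ranges)."""
    regions = [Region() for _ in range(num_regions)]
    for i in range(num_rows):
        regions[i % num_regions].put(f"row{i:06d}", f"v{i % 50}")
    return regions


def indexed_lookup(regions, value):
    """Look up an indexed value; returns (matching row keys, regions contacted)."""
    matches, calls = set(), 0
    for region in regions:  # the value says nothing about row-key placement
        calls += 1          # one round trip per region
        matches |= region.lookup(value)
    return matches, calls


if __name__ == "__main__":
    for n in (4, 20, 100):
        matches, calls = indexed_lookup(build_table(n, 1000), "v7")
        print(f"{n} regions: {len(matches)} matching rows, {calls} regions contacted")
```

In this model the number of matching rows stays fixed while the number of regions contacted grows linearly with the region count, which is the scaling worry raised above. An index partitioned by the indexed value instead would avoid the fan-out, at the cost of cross-region writes.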

Jon.

On Wed, Dec 5, 2012 at 9:54 AM, Anoop John <anoop.hbase@gmail.com> wrote:

> I mean HBase devs to work on having sec indexing available with the HBase
> distribution...  Now I guess many users of HBase implement different kinds
> of sec indexing ..:)
>
> We @Huawei would be happy to provide our support in it as per the interest
> from the community.. :)
>
> -Anoop-
>
> On Wed, Dec 5, 2012 at 9:29 PM, Andrey Stepachev <octo47@gmail.com> wrote:
>
> > Can you explain, what you mean under 'HBase community version with sec
> > indexing in it'. You will wait, until someone implements the same algorithm
> > in trunk hbase, or?
> >
> >
> > On Wed, Dec 5, 2012 at 4:55 PM, Anoop Sam John <anoopsj@huawei.com> wrote:
> >
> > > No this is not open sourced yet..  As per the interest from the HBase
> > > community we can think of contributing..
> > > It is time to see HBase community version with sec indexing in it (IMHO)
> > >  :)
> > >
> > > -Anoop-
> > >
> > > ________________________________________
> > > From: Andrey Stepachev [octo47@gmail.com]
> > > Sent: Wednesday, December 05, 2012 5:22 PM
> > > To: dev@hbase.apache.org
> > > Subject: Re: HBase - Secondary Index
> > >
> > > Hi.
> > >
> > > Indexing solution looks tempting.
> > > Are there any plans to open source your solution (or it already open and I
> > > can't find it?).
> > >
> > >
> > >
> > >
> > > On Tue, Dec 4, 2012 at 12:10 PM, Anoop Sam John <anoopsj@huawei.com>
> > > wrote:
> > >
> > > > Hi All
> > > >
> > > >             Last week I got a chance to present the secondary indexing
> > > > solution what we have done in Huawei at the China Hadoop Conference.  You
> > > > can see the presentation from
> > > > http://hbtc2012.hadooper.cn/subject/track4Anoop%20Sam%20John2.pdf
> > > >
> > > >
> > > >
> > > > I would like to hear what others think on this. :)
> > > >
> > > >
> > > >
> > > > -Anoop-
> > > >
> > >
> > >
> > >
> > > --
> > > Andrey.
> > >
> >
> >
> >
> > --
> > Andrey.
> >
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

