hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: [ANNOUNCE] Secondary Index in HBase - from Huawei
Date Thu, 15 Aug 2013 01:10:08 GMT
No it doesn't in case 2 below. Quite the opposite.

 From: Michael Segel <michael_segel@hotmail.com>
To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org> 
Sent: Wednesday, August 14, 2013 4:57 PM
Subject: Re: [ANNOUNCE] Secondary Index in HBase - from Huawei

It's not a question of their solution not working. 
It's that it takes a lot more resources on the read than the alternative. 

On Aug 14, 2013, at 5:35 PM, lars hofhansl <larsh@apache.org> wrote:

> Yep.
> 1. highly selective indexes + point gets -> global inverted index tables
> 2. less selective indexes + queries returning many rows -> "local" indexes, such as
> the Huawei solution.
> Of course it's not quite that black and white. Global indexes that serve index covered
> queries (where the query can be answered from the index alone) would also work in many cases
> of non-selective queries.
> In the end it is quite simple (IMHO):
> If a query retrieves data from only a single region, you want to be able to home in on that
> region quickly, via a piece of global information.
> If on the other hand a query returns data from many regions, you're better off handling
> the filtering locally.
> Just my $0.02.
> -- Lars
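
The trade-off between the two cases above can be illustrated with a toy model. This is only a sketch, not how HBase or the Huawei patch implements indexing: the `ToyTable` class, its methods, and the "contacts" cost measure (one unit per region or index table read) are all hypothetical and exist purely for illustration. Updates and deletes are ignored.

```python
from bisect import bisect_right
from collections import defaultdict


class ToyTable:
    """Toy model (hypothetical): rows partitioned into regions by row key."""

    def __init__(self, split_points):
        self.split_points = sorted(split_points)
        self.regions = [dict() for _ in range(len(self.split_points) + 1)]
        # One local index per region: column value -> row keys in that region.
        self.local_indexes = [defaultdict(set) for _ in self.regions]
        # One global inverted index: column value -> row keys in any region.
        self.global_index = defaultdict(set)

    def region_of(self, row_key):
        # Index of the region whose key range contains row_key.
        return bisect_right(self.split_points, row_key)

    def put(self, row_key, value):
        # Insert only; overwriting an existing row would leave stale index entries.
        r = self.region_of(row_key)
        self.regions[r][row_key] = value
        self.local_indexes[r][value].add(row_key)
        self.global_index[value].add(row_key)

    def query_global(self, value):
        """Read the global index, then visit only the regions holding matches."""
        row_keys = sorted(self.global_index.get(value, ()))
        regions_touched = {self.region_of(k) for k in row_keys}
        # The +1 counts the separate read of the index table itself.
        return row_keys, len(regions_touched) + 1

    def query_local(self, value):
        """Ask every region to filter using its own local index."""
        row_keys = []
        for index in self.local_indexes:
            row_keys.extend(index.get(value, ()))
        return sorted(row_keys), len(self.regions)


table = ToyTable(split_points=["g", "n", "t"])  # four regions
for key, color in [("a1", "red"), ("b2", "blue"), ("h1", "blue"),
                   ("p1", "blue"), ("u1", "blue")]:
    table.put(key, color)

# Selective value: the global index needs 2 contacts, the local scheme needs 4.
print(table.query_global("red"), table.query_local("red"))
# Non-selective value: the global index needs 5 contacts, the local scheme 4.
print(table.query_global("blue"), table.query_local("blue"))
```

Under this crude cost model, the global index wins when matches sit in a single region, and the local indexes win once matches are spread across every region. Real costs (per-row fetches, index maintenance on writes, covered queries) are not modeled here.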
> ________________________________
> From: Andrew Purtell <apurtell@apache.org>
> To: "dev@hbase.apache.org" <dev@hbase.apache.org> 
> Sent: Wednesday, August 14, 2013 1:52 PM
> Subject: Re: [ANNOUNCE] Secondary Index in HBase - from Huawei
> On Wed, Aug 14, 2013 at 8:45 AM, Michael Segel <michael_segel@hotmail.com> wrote:
>> This isn't too bad if you're doing a simple query against one index. You
>> can do the work by RS and then join the results from all RS.
>> However… what happens if you have two indexes and your result set is going
>> to be the intersection of the indexes?
>> Or you're going to do a join between two tables using the indexes to limit
>> the result set?
>> Now your design breaks down quickly.
> You may have just described their design assumptions.
> I'm not endorsing this per se, but suggesting it is not a good idea on
> account it can't live up to the requirements of a pretty particular
> strawman seems a step too far.
> Maybe someone from Huawei can talk a bit here about successful use cases?
>> You could also look at Lucene which we did a PoC a few years back.
> A certain large technology company has an HBase full text index built on
> Lucene that might be offered as a contribution at some point. From what I
> know of it, there are a different set of tradeoffs and it certainly won't
> work for everyone, and not because the people working on it were not smart
> enough to find a silver bullet.
> -- 
> Best regards,
>    - Andy
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)

The opinions expressed here are mine, while they may reflect a cognitive thought, that is
purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com