hbase-dev mailing list archives

From James Taylor <jtay...@salesforce.com>
Subject Re: Design review: Secondary index support through coprocess
Date Thu, 09 Jan 2014 16:48:03 GMT
IMHO, it would be valuable if the design considered both a global
indexing solution and a local indexing solution. Both are useful in
different circumstances. The global indexing design plus the
application integration points could be derived from Jesse's work with
his reference implementation in Phoenix - the global indexing code has
no Phoenix dependencies and clearly defined integration points.


On Jan 9, 2014, at 6:36 AM, Jesse Yates <jesse.k.yates@gmail.com> wrote:

> Yes, that was a big concern I had as well.
> It's not clear how that will work with a large number of indexes; if people
> have one index, they will want more than one. To not plan for that seems
> like an incomplete implementation to me. In a horizontally scalable system
> like HBase, lots of buddy regions aren't going to work out well.* Once we
> have regions that cannot be collocated, the extra RPC time starts to be the
> biggest factor (as the doc points out) and we are back to what Phoenix is
> already doing**.
> But I'm probably missing something here in what makes it different?
> For folks that haven't been following the issue, some high-level "how it all
> kinda works" would be helpful from the championing committers; that's a long
> doc to get through and grok :). How similar is this to the work currently
> done by the existing indexing implementations (huawei, Phoenix, ngdata)? The doc
> doesn't really nail down the interactions, but instead jumps right in after
> describing why SI should be added.
> Agree this would be super useful, but don't want to waste too much work
> reinventing the wheel or doing the wrong thing. Further, this impl quickly
> starts to lead down the query optimization path, which gets HBase away from
> its core "be a great byte store".
> Like I said, I'm all for secondary indexes in HBase and think this is a
> great push. I don't mean to rain on any parades.
> - jesse
> * but a smart way to specify region collocation? That I can get behind as
> it would unify a couple different indexing impls (e.g. Phoenix would
> consider using it to help make indexing faster - RPCs do suck).
> ** for instance, the doc talks about how to implement indexing for
> floats... That might be a default impl, but for use cases like Phoenix this
> would break all our current encodings. We handled this in the indexing impl
> by making the builder pluggable for different use cases to support
> different encodings. I feel like a lot of the code for this kind of SI
> impl is already in Phoenix and has been working and fast for several months
> now; it's surprisingly tricky, especially with the delete cases and
> timestamp manipulation issues.
> On Thursday, January 9, 2014, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
> wrote:
>> Could you explain how the 1-1 association between user and index table
>> regions is maintained? I wasn't able to understand it fully from the document.
>> ----- Original Message -----
>> From: Ted Yu <dev@hbase.apache.org>
>> To: dev@hbase.apache.org
>> At: Jan 8, 2014 3:41:40 PM
>> Hi,
>> Secondary index support is a frequently requested feature.
>> Please find the updated design doc here:
>> https://issues.apache.org/jira/secure/attachment/12621909/SecondaryIndex%20Design_Updated_2.pdf
>> HBASE-9203 is the umbrella JIRA.
>> The implementation patch was attached to HBASE-10222.
>> Thanks to Rajesh who works on this feature.
>> Cheers
> --
> -------------------
> Jesse Yates
> @jesse_yates
> jyates.github.com
