hbase-dev mailing list archives

From "Rose, Joseph" <Joseph.R...@childrens.harvard.edu>
Subject Re: Status of Huawei's 2' Indexing?
Date Mon, 16 Mar 2015 15:38:02 GMT
Michael,

I don’t understand the invective. I’m sure you have something to
contribute, but when you bring on this tone the only thing I hear is the snide
comments.


-j


P.s., I’ll refer you to this: https://hbase.apache.org/book.html#_joins


On 3/16/15, 11:15 AM, "Michael Segel" <michael_segel@hotmail.com> wrote:

>You’ll have to excuse Andy.
>
>He’s a bit slow.  HBASE-13044 should have been done 2 years ago. And it
>was trivial. Just got done last month….
>
>But I digress… The long story short…
>
>HBASE-9203 was brain dead from inception.  Huawei’s idea was to index on
>the region which had two problems.
>1) Complexity in that they wanted to keep the index on the same region
>server
>2) Joins become impossible.  Well, actually not impossible, but
>incredibly slow when compared to the alternative.
>
>You really should go back to the email chain.
>Their defense (including Salesforce who was going to push this approach)
>fell apart when you asked the simple question on how do you handle joins?
>
>That’s their OOPS moment. Once you understand that, and allow the index
>to be orthogonal to the base table, things start to come together.
>
>In short, you have a query either against a single table or across a
>join.  You then get the indexes and, assuming that you’re only using the
>AND predicate, it’s a simple intersection of the index result sets.
>(Since the result sets are ordered, it’s relatively trivial to walk
>through and find the intersections of N lists in a single pass.)
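
The single-pass intersection of N ordered index result sets described above can be sketched roughly as follows. This is only an illustration under stated assumptions: each index lookup is assumed to return base-table row keys as an ascending, duplicate-free list of bytes, and the name intersect_sorted is hypothetical. It is not part of HBase, Phoenix, or the HBASE-9203 patch.

```python
from typing import List, Sequence


def intersect_sorted(lists: Sequence[Sequence[bytes]]) -> List[bytes]:
    """Intersect N ascending, duplicate-free lists of row keys in one pass.

    Hypothetical helper: each input stands in for the ordered result set
    of one index lookup (one AND-ed predicate).
    """
    # No inputs, or any empty input, means the intersection is empty.
    if not lists or any(len(lst) == 0 for lst in lists):
        return []

    pos = [0] * len(lists)  # one forward-only cursor per list
    result: List[bytes] = []

    while True:
        # The largest key under any cursor is the only possible next match.
        candidate = max(lst[p] for lst, p in zip(lists, pos))
        matched = True
        for i, lst in enumerate(lists):
            # Skip keys smaller than the candidate; they cannot be in every list.
            while pos[i] < len(lst) and lst[pos[i]] < candidate:
                pos[i] += 1
            if pos[i] == len(lst):
                return result  # one list is exhausted, so nothing more can match
            if lst[pos[i]] != candidate:
                matched = False
        if matched:
            result.append(candidate)
            # Move every cursor past the matched key.
            for i in range(len(lists)):
                pos[i] += 1
            if any(p == len(lst) for lst, p in zip(lists, pos)):
                return result


if __name__ == "__main__":
    # Made-up row keys, standing in for three index lookups AND-ed together.
    by_city = [b"row1", b"row3", b"row5"]
    by_year = [b"row3", b"row4", b"row5"]
    by_type = [b"row0", b"row3", b"row5", b"row9"]
    print(intersect_sorted([by_city, by_year, by_type]))
```

Each cursor only moves forward, so every list is read at most once. The resulting row keys could then be fetched from the base table or passed to a map/reduce job.
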
>
>
>Now you have your result set of base table row keys and you can work with
>that data. (Either returning the records to the client, or as input to a
>map/reduce job.)
>
>That’s the 30K view.  There’s more to it, but once Salesforce got the
>basic idea, they ran with it. It was really that simple concept that the
>index would be orthogonal to the base table that got them moving in the
>right direction. 
>
>
>To Joseph’s point, indexing isn’t necessarily an RDBMS feature. However,
>it seems that some of the Committers are suffering from rectal induced
>hypoxia. HBASE-12853 was created not just to help solve the issue of ‘hot
>spotting’ but also to get the Committers to focus on bringing the
>solutions that they glom onto in the client back to the server side of
>things.
>
>Unfortunately the last great attempt at fixing things on the server side
>was the bastardization of coprocessors which again, suffers from the lack
>of thought.  This isn’t to say that allowing users to extend the server
>side functionality is wrong. (Because it isn’t.) But that the
>implementation done in HBase is a tad lacking in thought.
>
>So in terms of indexing…
>Longer term picture, there have to be some fixes on the server side of
>things to allow one to associate an index (allowing for different types)
>to a base table, yet the implementation of using the index would end up
>becoming a client.  And by client, it would be an external query engine
>processor that could/should sit on the cluster.
>
>But hey! What do I know?
>I gave up trying to have an intelligent/civilized conversation with
>Andrew because he just couldn’t grasp the basics.  ;-)
>
>
>
>
>
>> On Mar 13, 2015, at 4:14 PM, Andrew Purtell <apurtell@apache.org> wrote:
>> 
>> When I made that remark I was thinking of a recent discussion we had at a
>> joint Phoenix and HBase developer meetup. The difference of opinion was
>> certainly civilized. (smile) I'm not aware of any specific written
>> discussion, it may or may not exist. I'm pretty sure a revival of HBASE-9203
>> would attract some controversy, but let me be clearer this time than I was
>> before that this is just my opinion, FWIW.
>> 
>> 
>> On Thu, Mar 12, 2015 at 3:58 PM, Rose, Joseph <
>> Joseph.Rose@childrens.harvard.edu> wrote:
>> 
>>> I saw that it was added to their project. I’m really not keen on bringing
>>> in all the RDBMS apparatus on top of hbase, so I decided to follow other
>>> avenues first (like trying to patch 0.98, for better or worse.)
>>> 
>>> That Phoenix article seems like a good breakdown of the various indexing
>>> architectures.
>>> 
>>> HBASE-9203 (the ticket that deals with 2’ indexes) is pretty civilized (as
>>> are most of them, it seems) so I didn’t know there were these differences
>>> of opinion. Did I miss the mailing list thread where the architectural
>>> differences were discussed?
>>> 
>>> 
>>> -j
>
>The opinions expressed here are mine, while they may reflect a cognitive
>thought, that is purely accidental.
>Use at your own risk.
>Michael Segel
>michael_segel (AT) hotmail.com
>
>
>
>
>
