hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Status of Huawei's 2' Indexing?
Date Mon, 16 Mar 2015 16:41:27 GMT
I don't understand the repeated mention of "Salesforce" in that invective.
As a point of fact, the work of adding local mutable indexes to Phoenix was
done by a contributor from Huawei, who has since moved over to Hortonworks,
if I'm not mistaken - but not like affiliation matters, it really doesn't.

As for the rest, well I've had to give up on your like and respect, but I
picked up the pieces of my life a while back after we had that falling out
over coprocessors.

On Mon, Mar 16, 2015 at 8:14 AM, Michael Segel <michael_segel@hotmail.com>
wrote:

> You’ll have to excuse Andy.
> He’s a bit slow.  HBASE-13044 should have been done 2 years ago. And it
> was trivial. Just got done last month….
> But I digress… The long story short…
> HBASE-9203 was brain dead from inception.  Huawei’s idea was to index on
> the region which had two problems.
> 1) Complexity in that they wanted to keep the index on the same region
> server
> 2) Joins become impossible.  Well, actually not impossible, but incredibly
> slow when compared to the alternative.
> You really should go back to the email chain.
> Their defense (including Salesforce who was going to push this approach)
> fell apart when you asked the simple question on how do you handle joins?
> That’s their OOPS moment. Once you start to understand that, and allow
> the index to be orthogonal to the base table, things start to come
> together.
> In short, you have a query either against a single table or across a
> join.  You then get the indexes and, assuming that you’re only using
> the AND predicate, it’s a simple intersection of the index result sets.
> (Since the result sets are ordered, it’s relatively trivial to walk through
> and find the intersections of N lists in a single pass.)
> Now you have your result set of base table row keys and you can work with
> that data (either returning the records to the client, or as input to a
> map/reduce job).
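
The intersection step described in the quoted text above can be sketched as follows. This is only an illustration under stated assumptions, not code from HBase, Phoenix, or HBASE-9203: each index lookup is assumed to return an ascending, duplicate-free list of base table row keys, and the function name `intersect_sorted` is hypothetical.

```python
from typing import List, Sequence


def intersect_sorted(lists: Sequence[Sequence[bytes]]) -> List[bytes]:
    """Intersect N ascending, duplicate-free lists of row keys in one pass.

    Hypothetical helper: each input stands in for the ordered result set
    of one index lookup (one list per AND-ed predicate).
    """
    if not lists:
        return []

    # One forward-only cursor per list; no cursor ever moves backwards.
    positions = [0] * len(lists)
    result: List[bytes] = []

    # Stop as soon as any list is exhausted: nothing further can match.
    while all(pos < len(lst) for pos, lst in zip(positions, lists)):
        heads = [lst[pos] for pos, lst in zip(positions, lists)]
        largest = max(heads)

        if all(head == largest for head in heads):
            # Every list agrees on this row key, so it is in the intersection.
            result.append(largest)
            positions = [pos + 1 for pos in positions]
        else:
            # Advance only the cursors that are behind the largest head.
            positions = [
                pos + 1 if head < largest else pos
                for pos, head in zip(positions, heads)
            ]

    return result
```

Because every cursor only moves forward, the total work is bounded by the combined length of the input lists, which is the "single pass" property the quoted text relies on. The resulting row keys would then be used to fetch rows from the base table or to feed a downstream job.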
> That’s the 30K view.  There’s more to it, but once Salesforce got the
> basic idea, they ran with it. It was really that simple concept that the
> index would be orthogonal to the base table that got them moving in the
> right direction.
> To Joseph’s point, indexing isn’t necessarily an RDBMS feature. However,
> it seems that some of the Committers are suffering from rectal induced
> hypoxia. HBASE-12853 was created not just to help solve the issue of ‘hot
> spotting’ but also to get the Committers to focus on bringing the solutions
> that they glom on in the client, back to the server side of things.
> Unfortunately the last great attempt at fixing things on the server side
> was the bastardization of coprocessors which again, suffers from the lack
> of thought.  This isn’t to say that allowing users to extend the server
> side functionality is wrong. (Because it isn’t.) But that the
> implementation done in HBase is a tad lacking in thought.
> So in terms of indexing…
> Longer term picture, there has to be some fixes on the server side of
> things to allow one to associate an index (allowing for different types) to
> a base table, yet the implementation of using the index would end up
> becoming a client.  And by client, it would be an external query engine
> processor that could/should sit on the cluster.
> But hey! What do I know?
> I gave up trying to have an intelligent/civilized conversation with Andrew
> because he just couldn’t grasp the basics.  ;-)
> > On Mar 13, 2015, at 4:14 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >
> > When I made that remark I was thinking of a recent discussion we had at a
> > joint Phoenix and HBase developer meetup. The difference of opinion was
> > certainly civilized. (smile) I'm not aware of any specific written
> > discussion, it may or may not exist. I'm pretty sure a revival of HBASE-9203
> > would attract some controversy, but let me be clearer this time than I was
> > before that this is just my opinion, FWIW.
> >
> >
> > On Thu, Mar 12, 2015 at 3:58 PM, Rose, Joseph <
> > Joseph.Rose@childrens.harvard.edu> wrote:
> >
> >> I saw that it was added to their project. I’m really not keen on bringing
> >> in all the RDBMS apparatus on top of hbase, so I decided to follow other
> >> avenues first (like trying to patch 0.98, for better or worse.)
> >>
> >> That Phoenix article seems like a good breakdown of the various indexing
> >> architectures.
> >>
> >> HBASE-9203 (the ticket that deals with 2’ indexes) is pretty civilized (as
> >> are most of them, it seems) so I didn’t know there were these differences
> >> of opinion. Did I miss the mailing list thread where the architectural
> >> differences were discussed?
> >>
> >>
> >> -j
> The opinions expressed here are mine, while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
