hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Status of Huawei's 2' Indexing?
Date Mon, 16 Mar 2015 18:17:55 GMT
That's patently untrue and pure paranoia. The comment about having a
civilized discussion had nothing to do with you Michael. Joseph said:

"HBASE-9203 (the ticket that deals with 2’ indexes) is pretty civilized (as are
most of them, it seems)"


and so I responded as you saw. I was not thinking of you, I swear I never
think of you unless you write in and call me names. Please let these nice
people get back to the topic at hand.



>
> On 3/16/15, 12:18 PM, "Michael Segel" <michael_segel@hotmail.com> wrote:
>
> >Joseph,
> >
> >The issue with Andrew goes back a few years.  His comment about having a
> >civilized discussion was a personal dig at me.
> >
> >
> >> On Mar 16, 2015, at 10:38 AM, Rose, Joseph
> >><Joseph.Rose@childrens.harvard.edu> wrote:
> >>
> >> Michael,
> >>
> >> I don’t understand the invective. I’m sure you have something to
> >> contribute but when you bring on this tone the only thing I hear are the
> >> snide comments.
> >>
> >>
> >> -j
> >>
> >>
> >> P.s., I’ll refer you to this:
> >>
> >> https://hbase.apache.org/book.html#_joins
> >>
> >>
> >> On 3/16/15, 11:15 AM, "Michael Segel" <michael_segel@hotmail.com> wrote:
> >>
> >>> You’ll have to excuse Andy.
> >>>
> >>> He’s a bit slow.  HBASE-13044 should have been done 2 years ago. And it
> >>> was trivial. Just got done last month….
> >>>
> >>> But I digress… The long story short…
> >>>
> >>> HBASE-9203 was brain dead from inception.  Huawei’s idea was to index
> >>> on the region which had two problems.
> >>> 1) Complexity in that they wanted to keep the index on the same region
> >>> server
> >>> 2) Joins become impossible.  Well, actually not impossible, but
> >>> incredibly slow when compared to the alternative.
> >>>
> >>> You really should go back to the email chain.
> >>> Their defense (including Salesforce who was going to push this
> >>> approach) fell apart when you asked the simple question on how do you
> >>> handle joins?
> >>>
> >>> That’s their OOPS moment. Once you start to understand that, and allow
> >>> the index to be orthogonal to the base table, things start to come
> >>> together.
> >>>
> >>> In short, you have a query either against a single table or across a
> >>> join.  You then get the indexes and, assuming that you’re only using the
> >>> AND predicate, it's a simple intersection of the index result sets.
> >>> (Since the result sets are ordered, it's relatively trivial to walk
> >>> through and find the intersections of N lists in a single pass.)
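
The single-pass intersection of ordered index result sets described above can be sketched roughly as follows. This is only an illustration, not code from HBase, Phoenix, or HBASE-9203: the function name intersect_sorted is hypothetical, and it assumes each index lookup has already returned base table row keys in ascending order with no duplicates within a list.

```python
from typing import List, Sequence


def intersect_sorted(lists: Sequence[Sequence[bytes]]) -> List[bytes]:
    """Intersect N ascending, duplicate-free lists of row keys.

    Hypothetical helper: each cursor only moves forward, so every
    element of every list is visited at most once.
    """
    if not lists:
        return []
    positions = [0] * len(lists)  # one cursor per index result set
    result: List[bytes] = []
    # Stop as soon as any list is exhausted: nothing more can match.
    while all(pos < len(lst) for pos, lst in zip(positions, lists)):
        heads = [lst[pos] for pos, lst in zip(positions, lists)]
        largest = max(heads)
        if all(head == largest for head in heads):
            # Every index contains this row key, so it satisfies the AND.
            result.append(largest)
            positions = [pos + 1 for pos in positions]
        else:
            # Skip keys that are smaller than the current largest head;
            # they cannot appear in every list.
            for i, lst in enumerate(lists):
                while positions[i] < len(lst) and lst[positions[i]] < largest:
                    positions[i] += 1
    return result


if __name__ == "__main__":
    # Made-up row keys standing in for three index lookups.
    by_city = [b"row1", b"row3", b"row5"]
    by_age = [b"row2", b"row3", b"row5", b"row9"]
    by_status = [b"row3", b"row4", b"row5"]
    print(intersect_sorted([by_city, by_age, by_status]))
```

The resulting list is the set of base table row keys that every index agrees on, which is what would then be fetched for the client or fed to a map/reduce job.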
> >>>
> >>>
> >>> Now you have your result set of base table row keys and you can work
> >>> with that data. (Either returning the records to the client, or as
> >>> input to a map/reduce job.)
> >>>
> >>> That’s the 30K view.  There’s more to it, but once Salesforce got the
> >>> basic idea, they ran with it. It was really that simple concept that
> >>>the
> >>> index would be orthogonal to the base table that got them moving in the
> >>> right direction.
> >>>
> >>>
> >>> To Joseph’s point, indexing isn’t necessarily an RDBMS feature.
> >>> However, it seems that some of the Committers are suffering from rectal
> >>> induced hypoxia. HBASE-12853 was created not just to help solve the
> >>> issue of ‘hot spotting’ but also to get the Committers to focus on
> >>> bringing the solutions that they glom on in the client, back to the
> >>> server side of things.
> >>>
> >>> Unfortunately the last great attempt at fixing things on the server
> >>>side
> >>> was the bastardization of coprocessors which again, suffers from the
> >>>lack
> >>> of thought.  This isn’t to say that allowing users to extend the server
> >>> side functionality is wrong. (Because it isn’t.) But that the
> >>> implementation done in HBase is a tad lacking in thought.
> >>>
> >>> So in terms of indexing…
> >>> Longer term picture, there has to be some fixes on the server side of
> >>> things to allow one to associate an index (allowing for different
> >>>types)
> >>> to a base table, yet the implementation of using the index would end up
> >>> becoming a client.  And by client, it would be an external query engine
> >>> processor that could/should sit on the cluster.
> >>>
> >>> But hey! What do I know?
> >>> I gave up trying to have an intelligent/civilized conversation with
> >>> Andrew because he just couldn’t grasp the basics.  ;-)
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>> On Mar 13, 2015, at 4:14 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >>>>
> >>>> When I made that remark I was thinking of a recent discussion we had
> >>>> at a joint Phoenix and HBase developer meetup. The difference of
> >>>> opinion was certainly civilized. (smile) I'm not aware of any specific
> >>>> written discussion, it may or may not exist. I'm pretty sure a revival
> >>>> of HBASE-9203 would attract some controversy, but let me be clearer
> >>>> this time than I was before that this is just my opinion, FWIW.
> >>>>
> >>>>
> >>>> On Thu, Mar 12, 2015 at 3:58 PM, Rose, Joseph <
> >>>> Joseph.Rose@childrens.harvard.edu> wrote:
> >>>>
> >>>>> I saw that it was added to their project. I’m really not keen on
> >>>>> bringing in all the RDBMS apparatus on top of hbase, so I decided to
> >>>>> follow other avenues first (like trying to patch 0.98, for better or
> >>>>> worse.)
> >>>>>
> >>>>> That Phoenix article seems like a good breakdown of the various
> >>>>> indexing
> >>>>> architectures.
> >>>>>
> >>>>> HBASE-9203 (the ticket that deals with 2’ indexes) is pretty civilized
> >>>>> (as are most of them, it seems) so I didn’t know there were these
> >>>>> differences of opinion. Did I miss the mailing list thread where the
> >>>>> architectural differences were discussed?
> >>>>>
> >>>>>
> >>>>> -j
> >>>
> >>> The opinions expressed here are mine, while they may reflect a
> >>>cognitive
> >>> thought, that is purely accidental.
> >>> Use at your own risk.
> >>> Michael Segel
> >>> michael_segel (AT) hotmail.com
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >
> >The opinions expressed here are mine, while they may reflect a cognitive
> >thought, that is purely accidental.
> >Use at your own risk.
> >Michael Segel
> >michael_segel (AT) hotmail.com
> >
> >
> >
> >
> >
>
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
