hbase-dev mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: On coprocessor API evolution
Date Mon, 19 May 2014 11:18:55 GMT

I suggest that you go back and re-read Kevin O’Dell’s comment. 
Clearly there’s no straw man here.  

I thought your response was a joke, but apparently you’re serious about that. 

Again, when you allow user code to run in the same JVM as the RS, you add risk. Issues with
stability and security need to be addressed. 
You posted an earlier Jira in which even you see this as an issue.  (That’s the biggest irony.)

If you and the other committers can’t recognize this as an issue… then there is a more
serious problem that needs to be addressed. 

On May 19, 2014, at 11:24 AM, Andrew Purtell <andrew.purtell@gmail.com> wrote:

> Yes it's clear you have been laughing at many of the responses and have been having a
> merry time concern trolling your favorite straw man. You insist on fixating on analogies we
> have used to explain coprocessors to advanced users and spinning that into an absolutist (and
> ridiculous) position that doesn't match up with what everyone else is telling you. At some
> point you need to stop talking and start listening. The insults are not helping your case.
> Or maybe that's the point. Yes Michael you are the RDBMS king and we are all idiots. Happy?
> Now consider this: That means nothing as HBase isn't even in that space. Maybe you should
> try this over on the MySQL or Postgres mailing lists.
>> On May 19, 2014, at 12:06 AM, Michael Segel <michael_segel@hotmail.com> wrote:
>> Remind me how you do server side execution? 
>> I suggest you go through the emails on both the dev and user HBase lists, and you’ll
>> see that descriptions of coprocessors use the terms ‘trigger’ and ‘stored
>> procedures’, which are terms most DBAs and data modelers are familiar with.
>> And if you read my email earlier in the thread, I said that the analogy fell a little
>> short because of the extensibility aspect.
>> I have to ask, how many of the committers and readers on this thread have actual
>> RDBMS experience that goes beyond MySQL, and maybe Postgres?  I mentioned Sybase’s Adaptive
>> Server as well as Informix’s IDS.  Anyone know what I’m talking about? How about Oracle?
>> DB2?  I’ll even add Dick Pick’s Revelation or U2 (Universe), which are hierarchical systems.

>> The fact that you said ‘it ain’t broken’ means that you just don’t understand
>> the problem, even after Kevin O’Dell admitted that his (Cloudera’s) customers who are
>> using coprocessors are complaining of HBase not behaving in a stable fashion. It may be their
>> code and lack of understanding, but Cloudera still has to address it, and you can’t say using
>> coprocessors voids the warranty. (Sorry Andrew, I had to laugh at that one…)
>> I understand completely what coprocessors were designed for.  What you don’t seem
>> to understand is that the design and implementation fall short and open it up to potential
>> reliability issues.
>> To be clear… I haven’t even talked about APIs, I’m talking about the core design.

>> To the best of my knowledge, MapR’s M7 doesn’t have coprocessors. I’ll wager
>> that when they do, it will work and not have these issues. I believe that they are writing
>> their stuff in C/C++; if so, then they’d have the advantage of using shared memory.  Apache
>> would have to write C/C++ code and wrap it in JNI… which you may not want to do…
>> But getting back to the issue at hand… if HBase is viewed as hard to tune and hard
>> to keep up… people are going to look towards other solutions.
>> I believe you can already disable the ability to create end-user coprocessors. When
>> you load the last system coprocessor, you load one that looks to see who’s trying to add
>> a coprocessor and you just deny it. If you wanted to make a formal change, then you would
>> just have a database permission so that you can either GRANT or REVOKE a user’s
>> privilege to add coprocessors. But that would mean more work for someone.
>> -Mike
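[Editor's sketch] The GRANT/REVOKE idea in the message above could look roughly like the following. This is only a minimal illustration under stated assumptions: the class and method names (CoprocessorLoadPolicy, grant, revoke, mayLoad) are hypothetical and are not HBase APIs, and a real check would have to hook into wherever coprocessors are actually loaded.

```java
// Hypothetical sketch; none of these names are HBase APIs.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Tracks which users hold a (hypothetical) privilege to add coprocessors. */
class CoprocessorLoadPolicy {
    private final Set<String> granted = ConcurrentHashMap.newKeySet();

    /** GRANT: allow this user to add coprocessors. */
    void grant(String user) { granted.add(user); }

    /** REVOKE: withdraw the privilege again. */
    void revoke(String user) { granted.remove(user); }

    /** System coprocessors always load; user coprocessors need the privilege. */
    boolean mayLoad(String user, boolean isSystemCoprocessor) {
        return isSystemCoprocessor || granted.contains(user);
    }
}

class PolicyDemo {
    public static void main(String[] args) {
        CoprocessorLoadPolicy policy = new CoprocessorLoadPolicy();
        policy.grant("alice");
        System.out.println(policy.mayLoad("alice", false)); // alice holds the privilege
        System.out.println(policy.mayLoad("bob", false));   // bob was never granted it
        policy.revoke("alice");
        System.out.println(policy.mayLoad("alice", false)); // privilege withdrawn
    }
}
```

Note that a check like this only controls who may load code. It does nothing about the in-JVM stability risk argued over in this thread.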
>>> On May 18, 2014, at 9:58 PM, lars hofhansl <larsh@apache.org> wrote:
>>> Coprocessors are a means to extend HBase. Nothing more, nothing less. They are
>>> not stored procedures or triggers.
>>> Not sure in how many other ways we can/need to phrase that.
>>> I agree that there should be a simple way to disable user coprocessors (or at
>>> least disable loading from HDFS) for the security conscious. Let's do that, it's simple.
>>> There is nothing to "fix" since it ain't broken. It only seems broken when
>>> you do not understand what it was designed for.
>>> You want a new API for less invasive things in a sandbox, more like stored procedures
>>> and triggers... Sure, let's do that too. But realize that is a *new* use case, and that we'll
>>> keep the old stuff.
>>> -- Lars
>>> ________________________________
>>> From: Michael Segel <michael_segel@hotmail.com>
>>> To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org> 
>>> Sent: Sunday, May 18, 2014 10:21 AM
>>> Subject: Re: On coprocessor API evolution
>>> It doesn’t matter. 
>>> Sure we can follow Vlad’s rules… but you still have to get to the root of
>>> the problem and that is making coprocessors safe.
>>> It’s not an easy fix, and it would mean pretty much starting from scratch. Trying
>>> to kludge a fix is harder and will not be as good.
>>> Maybe you can salvage some code, but the issue is fixing coprocessors at the
>>> lowest level and working back up.
>>> You have to isolate the code to one or more separate JVMs so you can not only
>>> stop, but reload the processes.
>>> This is more than just simple triggers but also extensibility.
>>> If you could pick the brains of some of the folks still under Kevin Foster (@IBM)
>>> who work on IDS… you could get some ideas.
>>>> On May 18, 2014, at 7:01 AM, lars hofhansl <larsh@apache.org> wrote:
>>>> We've seen similar issues with Filters. Those are good rules to follow.
>>>> ________________________________
>>>> From: Vladimir Rodionov <vladrodionov@gmail.com>
>>>> To: "dev@hbase.apache.org" <dev@hbase.apache.org> 
>>>> Sent: Friday, May 16, 2014 10:59 AM
>>>> Subject: Re: On coprocessor API evolution
>>>> 1) Have default implementations (abstract classes) for every interface from
>>>> Coprocessor API.
>>>> 2) Advise coprocessor users not to implement the interface directly but to
>>>> subclass the default impl.
>>>> 3) Preserve backward compatibility by adding only new hooks/methods
>>>> 4) DO NOT CHANGE existing API (no method renaming, method parameter type
>>>> changes etc)
>>>> 5) Have regression tests to check backward compatibility.
>>>> -Vladimir
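[Editor's sketch] The rules above can be illustrated with a small, self-contained example. It is a minimal sketch, not the HBase coprocessor API: TableHooks, BaseTableHooks, AuditHooks, and HooksDemo are hypothetical names chosen for illustration. The point is that user code which subclasses the default implementation keeps compiling when the framework adds a new hook.

```java
// Hypothetical names throughout; this is not the HBase coprocessor API.
import java.util.ArrayList;
import java.util.List;

/** Hook interface owned by the framework. */
interface TableHooks {
    void prePut(String row, List<String> log);
    // Imagine this hook was added in a later release (rule 3);
    // prePut keeps its original signature (rule 4).
    void preDelete(String row, List<String> log);
}

/** Default no-op implementation shipped with the framework (rule 1). */
abstract class BaseTableHooks implements TableHooks {
    @Override public void prePut(String row, List<String> log) { }
    @Override public void preDelete(String row, List<String> log) { }
}

/** User code subclasses the default instead of implementing the interface (rule 2). */
class AuditHooks extends BaseTableHooks {
    @Override public void prePut(String row, List<String> log) {
        log.add("put:" + row);
    }
    // preDelete is not overridden; the inherited no-op keeps this class valid.
}

/** Small check in the spirit of rule 5. */
class HooksDemo {
    static List<String> run(TableHooks hooks) {
        List<String> log = new ArrayList<>();
        hooks.prePut("r1", log);
        hooks.preDelete("r1", log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(new AuditHooks()));
    }
}
```

Had AuditHooks implemented TableHooks directly, adding preDelete to the interface would have broken it at compile time. Subclassing the base class avoids that, which is the reasoning behind rules 1 and 2.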
>>>> On Fri, May 16, 2014 at 9:13 AM, Michael Segel <michael_segel@hotmail.com> wrote:
>>>>> Until you move the coprocessor out of the RS space and into its own
>>>>> sandbox… saying security and coprocessor in the same sentence is a
>>>>> Oh wait… you were serious… :-(
>>>>> I’d say there’s a significant rethink on coprocessors that’s required.
>>>>> Anyone running a secure (Kerberos) cluster will want to allow system
>>>>> coprocessors but then write a coprocessor that rejects user coprocessors.
>>>>> Just putting it out there…
>>>>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <apurtell@apache.org> wrote:
>>>>>> Because coprocessor APIs are so tightly bound with internals, if we apply
>>>>>> suggested rules like those mentioned on HBASE-11054:
>>>>>>      I'd say policy should be no changes to method apis across minor
>>>>>>      versions
>>>>>> This will lock coprocessor based components to the limitations of the API
>>>>>> as we encounter them. Core code does not suffer this limitation; we are
>>>>>> otherwise free to refactor and change internal methods. For example, if we
>>>>>> apply this policy to the 0.98 branch, then we will have to abandon further
>>>>>> security feature development there and move to trunk only. This is because
>>>>>> we already are aware that coprocessor APIs as they stand are insufficient
>>>>>> still.
>>>>>> Coprocessor APIs are a special class of internal method. We have had a
>>>>>> tension between allowing freedom of movement for developing them out and
>>>>>> providing some measure of stability for implementors for a while.
>>>>>> It is my belief that the way forward is something like HBASE-11125. Perhaps
>>>>>> we can take this discussion to that JIRA and have this long overdue
>>>>>> conversation.
>>>>>> Regarding security features specifically, I would also like to call
>>>>>> attention to HBASE-11127. I think security has been an optional feature
>>>>>> long enough, it is becoming a core requirement for the project, so should
>>>>>> be moved into core. Sure, we can therefore sidestep any issues with
>>>>>> coprocessor API sufficiency for hosting security features. However, in my
>>>>>> opinion we should pursue both HBASE-11125 and HBASE-11127; the first to
>>>>>> provide the relative stability long asked for by coprocessor API the
>>>>>> latter to cleanly solve emerging issues with concurrency and versioning.
>>>>>> --
>>>>>> Best regards,
>>>>>>   - Andy
>>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>>> (via Tom White)
