hbase-dev mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: On coprocessor API evolution
Date Sat, 17 May 2014 12:39:26 GMT
You have to understand…

I do see the importance of a hook that lets a trigger run third-party code on
the server side. 
No argument there. 

It’s just that the current implementation doesn’t sandbox the code, so nothing limits the
potential for harm to the RS.

In simple terms, you can isolate the code in a separate JVM and use IPC to connect the sandbox
to the RS when a trigger fires.
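
A minimal sketch of that idea in plain Java, not tied to any HBase API. Everything
named here (SandboxSketch, handle, serve, launch, the one-line-per-event protocol, the
64 MB heap cap, the 2 second timeout) is a hypothetical choice made for illustration.
The host starts a second JVM, writes one event per line to the child's stdin, and reads
one reply per line from its stdout. A failure in the hook comes back as an "error:"
reply instead of taking down the host process.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SandboxSketch {

    // Hypothetical user hook; in this sketch it only ever runs in the child JVM.
    public static String handle(String event) {
        if (event.startsWith("crash")) {
            throw new IllegalStateException("user code failed");
        }
        return "ok:" + event.toUpperCase(Locale.ROOT);
    }

    // Child side: read one event per line, write exactly one reply per line.
    public static void serve(BufferedReader in, PrintWriter out) {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                String reply;
                try {
                    reply = handle(line);
                } catch (RuntimeException e) {
                    reply = "error:" + e.getMessage();
                }
                out.println(reply);
                out.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Host side: start the sandbox as a separate JVM with its own heap cap.
    public static Process launch() throws IOException {
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        return new ProcessBuilder(java, "-Xmx64m",
                "-cp", System.getProperty("java.class.path"),
                SandboxSketch.class.getName(), "child")
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0 && args[0].equals("child")) {
            // Running as the sandbox: speak the line protocol over stdin/stdout.
            serve(new BufferedReader(new InputStreamReader(System.in, StandardCharsets.UTF_8)),
                  new PrintWriter(new OutputStreamWriter(System.out, StandardCharsets.UTF_8)));
            return;
        }
        Process child = launch();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try (BufferedWriter toChild = new BufferedWriter(
                 new OutputStreamWriter(child.getOutputStream(), StandardCharsets.UTF_8));
             BufferedReader fromChild = new BufferedReader(
                 new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8))) {
            for (String event : new String[] {"prePut row1", "crash please"}) {
                toChild.write(event);
                toChild.newLine();
                toChild.flush();
                // Read on a helper thread so a hung sandbox cannot block the host forever.
                Callable<String> readReply = fromChild::readLine;
                Future<String> reply = pool.submit(readReply);
                System.out.println(reply.get(2, TimeUnit.SECONDS));
            }
        } finally {
            child.destroyForcibly(); // the host can always kill the sandbox
            pool.shutdownNow();
        }
    }
}
```

Pipes are the simplest IPC to show. A real design would more likely want sockets or some
other channel, plus a policy for restarting a dead sandbox, and it would pay a
serialization cost on every trigger.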

In C/C++ you’d have shared memory segments, something you don’t really have in Java. 
(You could use C and then put a JNI wrapper around this…) 

Which goes to my point… this is something that is solvable. You just need to think about

You talk about RDBMSs. Triggers themselves are not an equivalent analogy. You can have a trigger
that then calls some code written in an SPL and you’re ok. You can control the SPL environment
so that you limit the risk of the server crashing. 
(SPL == Stored Procedure Language) 

If you’re running third-party code written in C/C++ or Java from your trigger,
then you have other issues. 

Sybase’s Adaptive Server had some serious issues, and poorly written C/C++ code could cause
serious performance problems… Informix IDS took a different approach and didn’t have those
issues.  And I’m dating myself because most here probably never worked with either Sybase
or Informix … ;-)

So using your RDBMS analogy… you have two different approaches. One worked … well enough,
but was problematic.  The other worked better, had fewer issues, and was more secure. 

One of the reasons why this is important… the longer the current implementation is in the
wild, the harder it will be to fix and the longer the fix will take. 

On May 17, 2014, at 11:44 AM, qiang tian <tianq01@gmail.com> wrote:

> My small 2 cents...:-)
> Hook/coprocessor is a useful mechanism for interacting with a system for
> things that cannot be done via the API.  For the end user, tradeoff factors
> like performance, security, reliability etc. can be controlled by an upper
> layer's policy.
> e.g. In an RDBMS, the end user has limited use cases for triggers, which
> eliminates the security factor entirely, and the performance tradeoff is
> left to the end user to decide. So from an evolution perspective,
> hook/coprocessor for the end user could be controlled by a query engine layer
> like Phoenix.
> For internal users, hooks had better not be used widely unless it is a MUST or
> strong flexibility/pluggability is required, e.g. things that can be part of
> the core had better not use them.
> thanks.
> On Sat, May 17, 2014 at 4:04 PM, Michael Segel <michael_segel@hotmail.com> wrote:
>> Andrew,
>> Is ‘magical fairy dust’ a reference to some new synthetic drug you take at
>> raves?
>> But let’s get back to reality.
>> Let’s try this again; simply put… the coprocessor runs in the same JVM as
>> the RS, therefore you have an unacceptable level of risk.
>> That inherent risk means that you cannot run HBase with end-user
>> coprocessors enabled when you want to have a stable and somewhat secure
>> environment.
>> The simple truth is that you need to decouple the end-user code
>> (coprocessor) from the RS.
>> It’s not a difficult concept to understand, and while reasonable, it would
>> mean a major rewrite of, and further work on, coprocessors.
>> Will decoupling the user space from the RS remove all risk? No.  And no,
>> I’m not suggesting that.
>> But it’s a critical piece of the puzzle.
>> It’s not just security, but also reliability.
>> On May 17, 2014, at 4:43 AM, Andrew Purtell <apurtell@apache.org> wrote:
>>> Michael,
>>> As you know, we have implemented security features with coprocessors
>>> precisely because they can be interposed on internal actions to make
>>> authoritative decisions in-process. Coprocessors are a way to have
>>> composable internal extensions. They don't have and probably never will
>>> have magic fairy security dust. We do trust the security coprocessor code
>>> because it was developed by the project. That is not the same thing as
>>> saying you can have 'security' and execute arbitrary user code in-process
>>> as a coprocessor. Just want to clear that up for you.
>>>> will want to allow system coprocessors but then write a coprocessor that
>>>> reject user coprocessors.
>>> That's a reasonable point.
>>> On Sat, May 17, 2014 at 12:13 AM, Michael Segel
>>> <michael_segel@hotmail.com> wrote:
>>>> Until you move the coprocessor out of the RS space and into its own
>>>> sandbox… saying security and coprocessor in the same sentence is a joke.
>>>> Oh wait… you were serious… :-(
>>>> I’d say there’s a significant rethink on coprocessors that’s required.
>>>> Anyone running a secure (Kerberos) cluster will want to allow system
>>>> coprocessors but then write a coprocessor that rejects user coprocessors.
>>>> Just putting it out there…
>>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <apurtell@apache.org> wrote:
>>>>> Because coprocessor APIs are so tightly bound with internals, if we apply
>>>>> suggested rules like as mentioned on HBASE-11054:
>>>>>    I'd say policy should be no changes to method apis across minor versions
>>>>> This will lock coprocessor based components to the limitations of the API
>>>>> as we encounter them. Core code does not suffer this limitation, we are
>>>>> otherwise free to refactor and change internal methods. For example, we
>>>>> apply this policy to the 0.98 branch, then we will have to abandon further
>>>>> security feature development there and move to trunk only. This is because
>>>>> we already are aware that coprocessor APIs as they stand are insufficient
>>>>> still.
>>>>> Coprocessor APIs are a special class of internal method. We have had
>>>>> tension between allowing freedom of movement for developing them out and
>>>>> providing some measure of stability for implementors for a while.
>>>>> It is my belief that the way forward is something like HBASE-11125. Perhaps
>>>>> we can take this discussion to that JIRA and have this long overdue
>>>>> conversation.
>>>>> Regarding security features specifically, I would also like to call your
>>>>> attention to HBASE-11127. I think security has been an optional feature
>>>>> long enough, it is becoming a core requirement for the project, so should
>>>>> be moved into core. Sure, we can therefore sidestep any issues with
>>>>> coprocessor API sufficiency for hosting security features. However, in my
>>>>> opinion we should pursue both HBASE-11125 and HBASE-11127; the first
>>>>> provide the relative stability long asked for by coprocessor API users, the
>>>>> latter to cleanly solve emerging issues with concurrency and versioning.
>>>>> --
>>>>> Best regards,
>>>>> - Andy
>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>> (via Tom White)
>>> --
>>> Best regards,
>>>  - Andy
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
