hbase-issues mailing list archives

From "Michael Segel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4047) [Coprocessors] Generic external process host
Date Fri, 10 Apr 2015 12:34:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14489529#comment-14489529 ]

Michael Segel  commented on HBASE-4047:
---------------------------------------

Sorry, I don't always follow JIRAs.

To answer your question... in terms of patches, it would be a massive rewrite and would probably
break the existing code base using coprocessors today. 
In terms of me providing a patch... will Apache indemnify me if I get sued for introducing
IP that I may have used or learned at a former company / client? 
(Didn't think so.)  

I can tell you what you need and I can pencil out a design.  But that's as far as I can go.


In terms of a strong requirement: a flag that stops the loading of coprocessor code once the
system coprocessors are loaded reduces the security issue to the point that the requirement
goes away.  There is a client large enough to make that request of one of the vendors; however,
they are not using HBase at a level where they are implementing coprocessors.
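
A minimal sketch of how such a flag could behave, written in Python for brevity rather than
in HBase's Java. The names CoprocessorRegistry, load, seal and lock_after_system_load are
hypothetical and chosen only for illustration; none of them is an HBase API.

```python
class CoprocessorRegistry:
    """Hypothetical registry that accepts coprocessors until it is sealed."""

    def __init__(self, lock_after_system_load=True):
        # Hypothetical flag: when True, seal() blocks every later load.
        self.lock_after_system_load = lock_after_system_load
        self.loaded = []
        self._sealed = False

    def load(self, name):
        if self._sealed:
            raise PermissionError(f"coprocessor loading is locked; refusing {name!r}")
        self.loaded.append(name)

    def seal(self):
        """Call once the system coprocessors have been loaded."""
        self._sealed = self.lock_after_system_load


registry = CoprocessorRegistry(lock_after_system_load=True)
registry.load("AccessController")  # system coprocessor, loaded at startup
registry.seal()
try:
    registry.load("UserSuppliedObserver")  # arrives after the seal, so it is rejected
except PermissionError as err:
    print(err)
```

With the flag off, seal() is a no-op and later loads still succeed, which matches the
behavior described as the status quo.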

Outside of a requirement, the issue is that using coprocessors adds risk to the system: risk
in terms of performance, stability, and security.  It also causes issues when it comes to
maintenance.  If you want to remove (not just shut off) a coprocessor, you can't without
restarting the RS and reloading the coprocessors that you want loaded. (e.g. class collision)

Coprocessors are necessary for extending HBase beyond a simple object store.  Security (XASecure
/ Ranger) requires them.  Adding OLTP and RDBMS-like features (transactions / isolation levels)
is also important to many.  So is fixing issues with compactions...

But I digress.   

> [Coprocessors] Generic external process host
> --------------------------------------------
>
>                 Key: HBASE-4047
>                 URL: https://issues.apache.org/jira/browse/HBASE-4047
>             Project: HBase
>          Issue Type: New Feature
>          Components: Coprocessors
>            Reporter: Andrew Purtell
>
> Where HBase coprocessors deviate substantially from the design (as I understand it) of
> Google's BigTable coprocessors is that we've reimagined them as a framework for internal
> extension. In contrast, BigTable coprocessors run as separate processes colocated with tablet
> servers. The essential trade-off is between performance, flexibility and possibility, and the
> ability to control and enforce resource usage.
> Since the initial design of HBase coprocessors some additional considerations are in play:
> - Developing computational frameworks sitting directly on top of HBase hosted in coprocessor(s);
> - Introduction of the map reduce next generation (mrng) resource management model, and
>   the probability that limits will be enforced via cgroups at the OS level after this is
>   generally available, e.g. when RHEL 6 deployments are common;
> - The possibility of deployment of HBase onto mrng-enabled Hadoop clusters via the mrng
>   resource manager and an HBase-specific application controller.
> Therefore we should consider developing a coprocessor that is a generic host for another
> coprocessor, but one that forks a child process, loads the target coprocessor into the child,
> establishes a bidirectional pipe and uses an eventing model and umbilical protocol to provide
> for the coprocessor loaded into the child the same semantics as if it were loaded internally
> to the parent, and (eventually) use available resource management capabilities on the platform
> -- perhaps via the mrng resource controller or directly with cgroups -- to limit the child
> as desired by system administrators or the application designer.
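
The host described in the issue (fork a child, load the target coprocessor there, forward
events over a bidirectional pipe) can be sketched roughly as follows. This is an illustration
in Python under stated assumptions, not the design proposed for HBase: the line-delimited JSON
"umbilical" format, the class ExternalCoprocessorHost, and the pre_put hook with its veto rule
are all hypothetical. Resource limits (cgroups or a resource manager) are not shown.

```python
import json
import subprocess
import sys

# Source for the child process. The "target coprocessor" here is a hypothetical
# pre_put hook that vetoes writes to rows whose key starts with "secret".
CHILD_SOURCE = r'''
import json, sys

def pre_put(payload):
    return {"allow": not payload["row"].startswith("secret")}

HANDLERS = {"pre_put": pre_put}

for line in sys.stdin:
    event = json.loads(line)
    result = HANDLERS[event["hook"]](event["payload"])
    sys.stdout.write(json.dumps(result) + "\n")
    sys.stdout.flush()
'''


class ExternalCoprocessorHost:
    """Parent-side proxy that forwards each hook call to the child over pipes."""

    def __init__(self):
        self.child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def call(self, hook, payload):
        # One JSON line out, one JSON line back; the caller blocks on the reply.
        self.child.stdin.write(json.dumps({"hook": hook, "payload": payload}) + "\n")
        self.child.stdin.flush()
        return json.loads(self.child.stdout.readline())

    def close(self):
        self.child.stdin.close()  # EOF ends the child's read loop
        self.child.wait()


host = ExternalCoprocessorHost()
print(host.call("pre_put", {"row": "user1"}))
print(host.call("pre_put", {"row": "secret-keys"}))
host.close()
```

Because the hook runs in a separate process, a crash or runaway loop in the coprocessor code
stays outside the parent; the price is a serialization round trip on every call, which is the
performance side of the trade-off named in the issue.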



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
