hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: about realtime map-reduce
Date Sun, 04 Oct 2009 19:06:07 GMT
On a related note, HBASE-1002 talks about generic user filters. But as you point out, there are
risks with untrusted code execution that have to be considered even for that restricted case.

One thing that can be done with some confidence that one user or job won't DoS everyone else
is to allow a fixed set of additive/aggregate functions to run in a scanner context on the
regionservers. This would avoid the need to send any of the data back to the client if the
goal is counting, summation, averaging, etc. And these functions can be stacked such that
a list of operations on columns is fed into a list of operations on the row.
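
To make the stacking idea concrete, here is a minimal sketch in plain Java. It is not HBase code: ScannerAggregates, Agg, and run are hypothetical names, rows are modelled as bare double arrays, and it shows only one reading of "column operations feeding row operations" (a per-row function over the cells, folded by a second function across the rows of the scan). The point is that the client picks from a fixed menu and ships no code.

```java
import java.util.List;
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch; none of these names are HBase APIs.
final class ScannerAggregates {

    // The fixed, server-defined menu. A client selects entries by name
    // and cannot supply code of its own.
    enum Agg {
        COUNT(0.0, (acc, v) -> acc + 1),
        SUM(0.0, (acc, v) -> acc + v),
        MAX(Double.NEGATIVE_INFINITY, Math::max),
        MIN(Double.POSITIVE_INFINITY, Math::min);

        final double identity;
        final DoubleBinaryOperator step;

        Agg(double identity, DoubleBinaryOperator step) {
            this.identity = identity;
            this.step = step;
        }

        // Fold this function over the cell values of a single row.
        double over(double[] cells) {
            double acc = identity;
            for (double v : cells) {
                acc = step.applyAsDouble(acc, v);
            }
            return acc;
        }
    }

    // Stage 1 reduces each row's cells to one number; stage 2 folds those
    // numbers across all rows seen by the scan. Only the result leaves the server.
    static double run(List<double[]> rows, Agg perRow, Agg acrossRows) {
        double acc = acrossRows.identity;
        for (double[] cells : rows) {
            acc = acrossRows.step.applyAsDouble(acc, perRow.over(cells));
        }
        return acc;
    }

    public static void main(String[] args) {
        List<double[]> rows = List.of(new double[] {1, 2, 3}, new double[] {4, 5});
        double sum = run(rows, Agg.SUM, Agg.SUM);
        double count = run(rows, Agg.COUNT, Agg.SUM);
        // An average needs only these two numbers, not the underlying cells.
        System.out.println("sum=" + sum + " count=" + count + " avg=" + (sum / count));
    }
}
```

Each function in the menu does constant work per cell, so the cost of a request stays proportional to the size of the scan, which is what makes the DoS argument plausible.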

Allowing arbitrary code, however, is the way to madness. There could be an option to allow this
through bytecode shipping, but I do not think anyone should fool themselves into thinking this
is at all safe to do in production. There is a middle ground of restricted code (e.g. no backwards
branches or cyclical calling dependencies allowed) which is interesting from both usability
and code safety perspectives. There are some research bytecode rewriting systems which could
serve as a starting point.
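
As a rough illustration of the two restrictions, here is a sketch of the check over a toy instruction list. This is not a JVM bytecode verifier and not any of the research systems: RestrictedCodeCheck, Insn, and the method names are all made up for the example, and callees missing from the program are simply treated as leaf built-ins.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; real bytecode would need a proper verifier.
final class RestrictedCodeCheck {

    // One instruction: an optional jump target (index, or -1) and an optional callee.
    static final class Insn {
        final int jumpTarget;
        final String callee;

        private Insn(int jumpTarget, String callee) {
            this.jumpTarget = jumpTarget;
            this.callee = callee;
        }

        static Insn plain() { return new Insn(-1, null); }
        static Insn jump(int target) { return new Insn(target, null); }
        static Insn call(String name) { return new Insn(-1, name); }
    }

    // A jump to the same or an earlier index can form a loop, so reject it.
    static boolean hasBackwardBranch(List<Insn> body) {
        for (int i = 0; i < body.size(); i++) {
            int target = body.get(i).jumpTarget;
            if (target >= 0 && target <= i) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search over the call graph; meeting a function that is
    // still on the current path means direct or mutual recursion.
    static boolean hasCallCycle(Map<String, List<Insn>> program) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String name : program.keySet()) {
            if (visit(name, program, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String name, Map<String, List<Insn>> program,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(name)) return true;
        if (done.contains(name)) return false;
        onPath.add(name);
        for (Insn insn : program.getOrDefault(name, List.of())) {
            if (insn.callee != null && visit(insn.callee, program, done, onPath)) {
                return true;
            }
        }
        onPath.remove(name);
        done.add(name);
        return false;
    }

    // Accept only programs with forward-only jumps and an acyclic call graph.
    static boolean isAcceptable(Map<String, List<Insn>> program) {
        for (List<Insn> body : program.values()) {
            if (hasBackwardBranch(body)) return false;
        }
        return !hasCallCycle(program);
    }
}
```

With only forward jumps and no recursion, every instruction can run a bounded number of times per invocation, so accepted code always terminates. That says nothing about memory use or what the called built-ins do, which would need separate limits.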

   - Andy
