hbase-dev mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: about realtime map-reduce
Date Mon, 05 Oct 2009 04:47:25 GMT
I'm not sure arbitrary code would be all that bad in a controlled
environment. I guess DoSing your own regionserver would be bad, but
still.

As for real time map reduce, that was a thing on Jonathan's slides,
and he mentioned it was a top secret fancy thing he was working on at
Streamy. No other details are available, unless he chooses to share
them.

-ryan

On Sun, Oct 4, 2009 at 12:06 PM, Andrew Purtell <apurtell@apache.org> wrote:
> On a related note, HBASE-1002 talks about generic user filters. But as you
> point out, there are risks with untrusted code execution which have to be
> considered even for that restricted case.
>
> One thing that can be done with some confidence that one user or job won't
> DoS everyone else is to allow a fixed set of additive/aggregate functions to
> run in a scanner context on the regionservers. This would avoid the need to
> send any of the data back to the client if the goal is counting, summation,
> averaging, etc. And these functions can be stacked such that a list of
> operations on columns is fed into a list of operations on the row.
>
> Allowing arbitrary code however is the way to madness. There could be an
> option to allow this through bytecode shipping but I do not think anyone
> should fool themselves into thinking this is at all safe to do in
> production. There is a middle ground of restricted code (e.g. no backwards
> branches or cyclical calling dependencies allowed) which is interesting from
> both usability and code safety perspectives. There are some research
> bytecode rewriting systems which could serve as a starting point.
>
>   - Andy
>
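
To make the "fixed set of aggregate functions" idea from the quoted message
concrete, here is a minimal sketch of how it might look. Everything in it is
hypothetical: ScanAggregates, Op, and aggregate() are illustrative names and
not part of HBase, rows are modeled as plain maps from column name to a long
value, and the stacking of column operations into row operations is left out.
The point is only that the client names operations from a closed enum, so no
user code runs on the server and only the folded results travel back.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongBinaryOperator;

// Hypothetical sketch: none of these types exist in HBase.
final class ScanAggregates {

    /** The closed set of operations a client may request by name. */
    enum Op {
        COUNT(0L, (acc, v) -> acc + 1),
        SUM(0L, (acc, v) -> acc + v),
        MAX(Long.MIN_VALUE, (acc, v) -> Math.max(acc, v));

        final long identity;           // starting value of the accumulator
        final LongBinaryOperator step; // folds one cell value into the accumulator

        Op(long identity, LongBinaryOperator step) {
            this.identity = identity;
            this.step = step;
        }
    }

    /**
     * Folds one column across all scanned rows and returns only the
     * accumulated results. Rows that lack the column are skipped.
     */
    static Map<Op, Long> aggregate(Iterable<Map<String, Long>> rows,
                                   String column,
                                   List<Op> ops) {
        Map<Op, Long> result = new EnumMap<>(Op.class);
        for (Op op : ops) {
            result.put(op, op.identity);
        }
        for (Map<String, Long> row : rows) {
            Long value = row.get(column);
            if (value == null) {
                continue; // this row has no cell in the requested column
            }
            for (Op op : ops) {
                result.put(op, op.step.applyAsLong(result.get(op), value));
            }
        }
        return result;
    }
}
```

An average is not a separate operation in this sketch; a client would
request SUM and COUNT together and divide locally. Each step does constant
work per cell, which is what keeps one job from monopolizing a regionserver
under the assumptions above.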
