hadoop-common-dev mailing list archives

From Stephane Bailliez <sbaill...@gmail.com>
Subject Re: Running tasks in the TaskTracker VM
Date Tue, 20 Mar 2007 10:19:32 GMT
Torsten Curdt wrote:
>> Executing users' code in system daemons is a security risk.
> 
> Of course there is a security benefit in starting the jobs in a 
> different JVM, but if you don't trust the code you are executing, this 
> is probably not for you either. So the bottom line is: if you weigh the 
> performance penalty against the gained security, I am still not excited 
> about the JVM spawning idea.
> 
> If you really consider security that big of a problem, come up with 
> your own language to simplify and restrict the jobs.

I think security here was more about the risk of taking down the whole 
task tracker.

Speaking as a complete idiot about distributed computing, I would say it 
is easy to blow up a JVM when running such distributed jobs (whether 
from OOM or anything else).
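
For illustration, a task as trivial as the sketch below will take down 
whatever JVM it runs in once the heap fills up, and an OutOfMemoryError 
raised by user code is not something the surrounding daemon can reliably 
recover from. (Purely illustrative; run with a small heap, e.g. -Xmx64m, 
to see it quickly.)

    import java.util.ArrayList;
    import java.util.List;

    // A runaway task: allocates 1 MB per iteration and never releases it.
    // Inside a shared tracker JVM, the resulting OutOfMemoryError starves
    // every other task sharing that heap.
    public class RunawayTask {
        public static void main(String[] args) {
            List<byte[]> leak = new ArrayList<byte[]>();
            while (true) {
                leak.add(new byte[1 << 20]);
            }
        }
    }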

If you run within the task tracker VM, you'll have to carefully size the 
tracker VM to accommodate the combined resources of all possible jobs 
running at the same time, or simply allocate a gigantic amount of 
resources 'just in case', which rather offsets any performance gain with 
a stability risk.
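
The alternative being discussed makes the opposite trade: give each task 
its own child JVM with its own heap cap, so a runaway task kills only the 
child and the tracker just sees a non-zero exit code. A minimal sketch 
(the class and task names here are hypothetical, not Hadoop's actual 
code):

    import java.io.File;
    import java.io.IOException;

    // Hypothetical launcher: runs one task in its own child JVM so an OOM
    // or crash in user code kills only the child, never the tracker daemon.
    public class ChildJvmLauncher {
        public static int runTask(String taskMainClass, String heapMax)
                throws IOException, InterruptedException {
            String java = new File(new File(System.getProperty("java.home"),
                    "bin"), "java").getPath();
            ProcessBuilder pb = new ProcessBuilder(
                    java,
                    "-Xmx" + heapMax,  // per-task heap cap, sized per job
                    "-cp", System.getProperty("java.class.path"),
                    taskMainClass);
            pb.inheritIO();            // forward the child's stdout/stderr
            Process child = pb.start();
            return child.waitFor();    // non-zero exit = task failed
        }

        public static void main(String[] args) throws Exception {
            // cap the task at 200 MB regardless of the tracker's own heap
            int exit = runTask("org.example.SomeTask", "200m");
            System.out.println("task exited with " + exit);
        }
    }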

Not to mention cleaning up all the mess left by finished jobs, including 
flushing the introspection cache to avoid leaks, which will then impact 
the performance of other jobs since it is not a selective flush.
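
Concretely, the broad flush the JDK offers is 
java.beans.Introspector.flushCaches(), which drops the cached BeanInfo 
for every class, including classes belonging to jobs still running; the 
per-class flushFromCaches(Class) variant would need the full set of 
classes the finished job loaded, which the tracker generally doesn't 
know. A rough sketch of that cleanup:

    import java.beans.Introspector;

    // Post-job cleanup a shared tracker JVM would need. The flush is
    // global, so every other job's classes get re-introspected afterwards.
    public class JobCleanup {
        static void afterJob() {
            Introspector.flushCaches(); // drops ALL cached BeanInfo
            System.gc();                // hint only; classloaders may linger
        }
    }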

Failing jobs are not exactly uncommon, and running things in a sandboxed 
environment with less risk to the tracker seems like a perfectly 
reasonable choice. So yeah, VM pooling certainly makes perfect sense for 
this, or we should probably look at what Doug suggests as well.

My 0.01 kopek ;)

-- stephane

