hadoop-common-dev mailing list archives

From Torsten Curdt <tcu...@apache.org>
Subject Re: Running tasks in the TaskTracker VM
Date Tue, 20 Mar 2007 11:29:09 GMT

On 20.03.2007, at 11:19, Stephane Bailliez wrote:

> Torsten Curdt wrote:
>>> Executing users' code in system daemons is a security risk.
>> Of course there is a security benefit in starting the jobs in a
>> different JVM but if you don't trust the code you are executing
>> this is probably not for you either. So bottom line is - if you
>> weigh up the performance penalty against the gained security, I am
>> still not excited about the JVM spawning idea.
>> If you really consider security that big of a problem - come up  
>> with your own language to ease and restrict the jobs.
>
> I think security here was more about 'taking down the whole task  
> tracker' risk.

Well, the same applies.

> Being a complete idiot for distributed computing, I would say it is  
> easy to explode a JVM when doing such distributed jobs, (should it  
> be for OOM or anything).

Then restrict what people can do - at least Google went that route.

> If you run within the task tracker vm you'll have to carefully size
> the tracker vm to accommodate potentially the resources of all
> possible jobs running at the same time or simply allocate a
> gigantic amount of resources 'just in case', which kind of offsets
> the benefits of any performance improvement to stability.

The question is whether the task tracker should have access to that
gigantic amount of resources, in one JVM or the other.

> Not to mention cleaning up all the mess left by running jobs,
> including flushing the introspection cache to avoid leaks, which
> will then impact performance of other jobs since it is not a
> selective flush.
>
> Failing jobs are not exactly uncommon and running things in a  
> sandboxed environment with less risk for the tracker seems like a  
> perfectly reasonable choice. So yeah, vm pooling certainly makes
> perfect sense for it.

I am still not convinced - sorry

It's a bit like wanting to run JSPs in a separate JVM because
they might take down the servlet container.

cheers
--
Torsten
