hadoop-common-user mailing list archives

From "Dmitry Pushkarev" <u...@stanford.edu>
Subject Hadoop and SGE
Date Wed, 30 Jun 2010 10:59:48 GMT
Dear Hadoop users,

I'm in the process of building a new cluster for our lab, and I'm trying to
run SGE alongside Hadoop. The idea is that each node would function as a
datanode at all times but, depending on demand, a fraction of the nodes
would run SGE jobs instead of a tasktracker. SGE jobs will not have access to
HDFS or the local filesystem (except for /tmp) and will run off an external
NAS; they aren't supposed to be IO-bound.

I'm trying to figure out the best way to set up this resource
sharing. One way would be to shut down the tasktrackers on reserved nodes and
add those nodes to the SGE pool. Another way is to run tasktrackers as SGE
jobs, with each tasktracker shutting down after some idle time.
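
To make the second option a bit more concrete, here is a rough sketch of the
idle-timeout loop that an SGE job wrapping a tasktracker might run. It is only
an illustration under assumptions: the two callbacks (count_running_tasks and
stop_tasktracker) are hypothetical placeholders, since how to query a
tasktracker for running tasks and how to stop it depends on the installation
(stopping could be something like `hadoop-daemon.sh stop tasktracker`). The
timeout and polling values are arbitrary.

```python
import time
from typing import Callable


def run_until_idle(
    count_running_tasks: Callable[[], int],   # hypothetical: returns tasks currently running
    stop_tasktracker: Callable[[], None],     # hypothetical: stops the local tasktracker
    idle_timeout_s: float = 600.0,            # arbitrary example value
    poll_interval_s: float = 30.0,            # arbitrary example value
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until the tasktracker has had no tasks for idle_timeout_s seconds,
    then stop it and return the idle time observed at shutdown."""
    last_busy = clock()
    while True:
        now = clock()
        if count_running_tasks() > 0:
            # Still doing work: reset the idle timer.
            last_busy = now
        elif now - last_busy >= idle_timeout_s:
            # Idle long enough: stop the tasktracker and let the job exit.
            stop_tasktracker()
            return now - last_busy
        sleep(poll_interval_s)
```

Once the function returns, the wrapping job script can exit, which hands the
slot back to SGE. The clock and sleep parameters are injectable only so the
loop can be exercised without waiting in real time.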

Has anyone tried something like this? I'd appreciate any advice.

