From: "Dmitry Pushkarev" <umka@stanford.edu>
To: common-user@hadoop.apache.org
Subject: Hadoop and SGE
Date: Wed, 30 Jun 2010 03:59:48 -0700
Dear Hadoop users,

I'm in the process of building a new cluster for our lab, and I'm trying to run SGE alongside Hadoop. The idea is that each node would function as a datanode at all times, but depending on the situation, a fraction of the nodes would run SGE jobs instead of Hadoop tasks. The SGE jobs will not have access to HDFS or the local filesystem (except for /tmp) and will run out of an external NAS, so they aren't supposed to be I/O bound.

I'm trying to figure out the best way to set up this resource sharing. One way would be to shut down the tasktrackers on the reserved nodes and add those nodes to the SGE pool. Another would be to run the tasktrackers themselves as SGE jobs, with each tasktracker shutting down after some idle time.

Has anyone tried something like this? I'd appreciate any advice.

Thanks.
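For reference, the first approach (stopping tasktrackers on reserved nodes and handing those nodes to SGE) could be sketched roughly as below. This is only a sketch under stated assumptions: it assumes the stock hadoop-daemon.sh script from $HADOOP_HOME, and an SGE queue instance named hadoop.q covering these hosts; both names are placeholders for whatever a given installation actually uses. RUN=echo makes it a dry run that just prints the commands.

```shell
#!/bin/sh
# Sketch: move a node between the Hadoop tasktracker pool and the SGE pool.
# The datanode is left running in both directions, so HDFS is unaffected.
# Placeholders (site-specific): $HADOOP_HOME, the queue name "hadoop.q".
# With RUN=echo (the default here) commands are printed, not executed.
RUN="${RUN:-echo}"

to_sge() {    # reserve node $1 for SGE
  # stop only the tasktracker; the datanode keeps serving HDFS blocks
  $RUN ssh "$1" "\$HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker"
  # re-enable the node's slots in the SGE queue instance
  $RUN qmod -e "hadoop.q@$1"
}

to_hadoop() { # return node $1 to the Hadoop pool
  # drain the node from SGE first, then bring the tasktracker back
  $RUN qmod -d "hadoop.q@$1"
  $RUN ssh "$1" "\$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker"
}

# dry run: print what would happen when node07 changes hands
to_sge node07
to_hadoop node07
```

Whether the node flip is driven by a cron job, by SGE load sensors, or by hand is a separate policy decision; the second approach (tasktrackers submitted as SGE jobs with an idle timeout) avoids this external switcher but needs the tasktracker wrapped in a job script that can detect idleness.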