From: Roberto Nunnari
Date: Thu, 17 Jan 2013 14:18:41 +0100
To: user@hadoop.apache.org
Subject: Re: building a department GPU cluster
Message-ID: <50F7FA31.4070303@supsi.ch>
In-Reply-To: <50F7B535.5030107@supsi.ch>

Roberto Nunnari wrote:
> Hi all.
>
> I'm writing to ask for advice or a pointer in the right direction.
>
> In our department, more and more researchers ask us (IT administrators)
> to assemble (or buy) GPGPU-powered workstations for parallel
> computing.
>
> As I already manage a small CPU cluster (resources managed using SGE),
> my boss and I talked about building a new GPU cluster. The problem is
> that I have no experience at all with GPU clusters.
>
> Apart from the GPU workstations already running, we have some
> new HW that looks promising to me as a starting point for a GPU cluster:
>
> - 1x Dell PowerEdge R720
> - 1x Dell PowerEdge C410x
> - 1x NVIDIA M2090 PCIe x16
> - 1x NVIDIA iPASS Cable Kit
>   (Dell forgot to include the iPASS adapter for the R720!! :-D)
>
> I'd be grateful if you could give me some advice or point me in
> the right direction.
>
> In particular, I'm interested in your opinion on:
> 1) Is the above HW suitable for a small GPU cluster (2 to 4/6 GPUs)?
> 2) Is Apache Hadoop suitable as a queuing and resource management
>    system, or what else could we use? We would like the cluster to be
>    usable by many users at once, so that no user has to worry about
>    resources, just as we do on the CPU cluster with SGE.
> 3) Which Linux distribution would be most appropriate?
> 4) What software stack is necessary (CUDA, Hadoop, other)?
>
> Thank you very much for your valuable insight!
>
> Best regards.
> Robi

Anybody on this, please?

Robi