From: Nitin Pawar
To: user@hadoop.apache.org
Date: Tue, 14 Aug 2012 17:26:22 +0530
Subject: Re: Pending reducers
In-Reply-To: <7AFB227D47B00A49A4C9E00FF82685B42E90E4@sara-exch-2.ka.sara.nl>
References: <7AFB227D47B00A49A4C9E00FF82685B42E8E01@sara-exch-2.ka.sara.nl> <7AFB227D47B00A49A4C9E00FF82685B42E90E4@sara-exch-2.ka.sara.nl>

What are the memory/CPU stats on the machines? Are they exhausted?

On Tue, Aug 14, 2012 at 5:20 PM, Evert Lammerts wrote:
>> Reducers of multiple jobs do run concurrently as long as they have the
>> resources available.
>
> Yep, and that's what's not happening in my situation. 528 reduce slots,
> 400 taken by one job, 26 of another job remain in pending state. What
> could explain this behavior?
>
> Evert
>
>>
>> If you want to stop someone from taking over the cluster, you can
>> create different job queues and assign a quota to each queue. You also
>> have the flexibility of allocating a max quota per user in a queue.
>>
>> On Tue, Aug 14, 2012 at 4:09 PM, Evert Lammerts wrote:
>> > Hi list,
>> >
>> > I have a cluster running Hadoop 0.20.205 with Kerberos enabled,
>> > exposing 528 map slots and 528 reduce slots. Currently somebody is
>> > running a NORMAL priority job with 7 mappers and 400 reducers. The
>> > mappers have finished and the system is processing the reducers.
>> > Another user is running a NORMAL priority job with 1 mapper and 26
>> > reducers. The mapper has finished, but the reducers won't come out of
>> > "pending" state. There are no other jobs running right now. We've not
>> > yet installed a different scheduler, so right now the system is using
>> > the default scheduler. How can this behavior be explained? I see
>> > mappers of multiple jobs run concurrently, and I *thought* I've seen
>> > reducers of multiple jobs run concurrently, but I'm not completely
>> > sure. Any idea?
>> >
>> > Evert
>>
>> --
>> Nitin Pawar

--
Nitin Pawar
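
For reference, the slot arithmetic from the thread can be sketched as below. This is only a capacity check using the figures quoted above (528 reduce slots, 400 in use, 26 pending). The function name is hypothetical, and the sketch does not model how the default Hadoop scheduler actually assigns tasks.

```python
def free_reduce_slots(total_slots: int, slots_in_use: int) -> int:
    """Return the number of reduce slots not currently occupied."""
    if slots_in_use > total_slots:
        raise ValueError("slots in use cannot exceed total slots")
    return total_slots - slots_in_use


# Figures reported in the thread.
TOTAL_REDUCE_SLOTS = 528
SLOTS_USED_BY_FIRST_JOB = 400
PENDING_REDUCERS_SECOND_JOB = 26

free = free_reduce_slots(TOTAL_REDUCE_SLOTS, SLOTS_USED_BY_FIRST_JOB)
fits = PENDING_REDUCERS_SECOND_JOB <= free
print(f"free reduce slots: {free}, pending reducers fit: {fits}")
```

On slot count alone the 26 pending reducers would fit in the remaining slots, which is why the reply asks about other resources such as memory and CPU on the machines.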