Date: Tue, 19 Oct 2010 17:17:35 +0530
Subject: Reduce groups
From: Matthew John
To: common-user@hadoop.apache.org

Hi all,

The number of reduce groups in my MapReduce job is always the same as the number of records output by the job. As I understand it, this means every record coming out of the shuffle/sort goes to a separate Reducer.reduce call. How can I change this?

My key is a BytesWritable. I tried writing my own Comparator and setting it with setOutputValueGroupingClass, but still no more than one record enters the same reduce group.

Could someone please explain the mechanism behind this so that I can fix the problem? I am not concerned about the Partitioner, since I am using a single reducer.

Thanks,
Matthew
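To illustrate the mechanism being asked about, here is a minimal plain-Java sketch. It uses no Hadoop classes, and the class name, the two comparators, and the 2-byte prefix layout are all hypothetical. It assumes that a reduce group is a run of adjacent keys, in sort order, for which the grouping comparator returns 0. Under that assumption, a grouping comparator that never returns 0 for distinct keys yields exactly one group per record, which matches the symptom described. In Hadoop itself the grouping comparator is registered with JobConf.setOutputValueGroupingComparator (old API) or Job.setGroupingComparatorClass (new API). A raw comparator over serialized BytesWritable keys also has to skip the 4-byte length prefix; the sketch below ignores serialization entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java simulation of how reduce groups are formed; not Hadoop code.
final class GroupingSketch {

    // Hypothetical key layout: the first PREFIX_LEN bytes identify the group.
    static final int PREFIX_LEN = 2;

    // Sort order: compares whole keys byte by byte (unsigned); on a tie the shorter key sorts first.
    static final Comparator<byte[]> SORT =
        (a, b) -> compareRange(a, b, Math.max(a.length, b.length));

    // Grouping order: looks only at the first PREFIX_LEN bytes of each key.
    static final Comparator<byte[]> GROUP_BY_PREFIX =
        (a, b) -> compareRange(a, b, PREFIX_LEN);

    // Compares at most `limit` leading bytes of a and b.
    static int compareRange(byte[] a, byte[] b, int limit) {
        int n = Math.min(limit, Math.min(a.length, b.length));
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        // All compared bytes are equal; fall back to the lengths, capped at limit.
        return Math.min(a.length, limit) - Math.min(b.length, limit);
    }

    // Sorts the keys, then starts a new group whenever the grouping
    // comparator says the current key differs from the previous one.
    static int countGroups(List<byte[]> keys, Comparator<byte[]> grouping) {
        List<byte[]> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        int groups = 0;
        byte[] previous = null;
        for (byte[] key : sorted) {
            if (previous == null || grouping.compare(previous, key) != 0) {
                groups++;
            }
            previous = key;
        }
        return groups;
    }
}
```

With four distinct 3-byte keys that share two prefixes, grouping with the full-key comparator gives four groups, while grouping on the prefix gives two. The grouping order has to be consistent with the sort order, otherwise keys that should share a group are not adjacent after sorting.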