From: anwar shaikh
Date: Mon, 14 May 2012 11:18:13 +0530
Subject: Number of Reduce Tasks
To: common-user@hadoop.apache.org

Hi Everybody,

I am executing a MapReduce
job that performs a JOIN operation using org.apache.hadoop.contrib.utils.join. Four files are given as input.

I think there are four map tasks running (based on the first task ID marked with asterisks in the log below, attempt_local_0001_m_000003_0). I have also set the number of reducers to 10 using job.setNumReduceTasks(10).

But only one reduce task is performed (the second marked task ID, attempt_local_0001_r_000000_0). So, could you please suggest how I can increase the number of reducers?

Below are some of the last lines from the log.

------------------------------------------------------------------------
12/05/14 10:32:46 INFO mapred.Task: Task '*attempt_local_0001_m_000003_0*' done.
12/05/14 10:32:46 INFO mapred.LocalJobRunner:
12/05/14 10:32:46 INFO mapred.Merger: Merging 4 sorted segments
12/05/14 10:32:46 INFO mapred.Merger: Down to the last merge-pass, with 4 segments left of total size: 8018 bytes
12/05/14 10:32:46 INFO mapred.LocalJobRunner:
12/05/14 10:32:46 INFO datajoin.job: key: 1 this.largestNumOfValues: 48
12/05/14 10:32:46 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/05/14 10:32:46 INFO mapred.LocalJobRunner:
12/05/14 10:32:46 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/05/14 10:32:46 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to file:/home/anwar/workspace/JoinLZOPfiles/OutLarge
12/05/14 10:32:49 INFO mapred.LocalJobRunner: actuallyCollectedCount 86 collectedCount 86 groupCount 25 > reduce
12/05/14 10:32:49 INFO mapred.Task: Task '*attempt_local_0001_r_000000_0*' done.
12/05/14 10:32:50 INFO mapred.JobClient:  map 100% reduce 100%
12/05/14 10:32:50 INFO mapred.JobClient: Job complete: job_local_0001
12/05/14 10:32:50 INFO mapred.JobClient: Counters: 17
12/05/14 10:32:50 INFO mapred.JobClient:   File Input Format Counters
12/05/14 10:32:50 INFO mapred.JobClient:     Bytes Read=1666
12/05/14 10:32:50 INFO mapred.JobClient:   File Output Format Counters
12/05/14 10:32:50 INFO mapred.JobClient:     Bytes Written=2421
12/05/14 10:32:50 INFO mapred.JobClient:   FileSystemCounters
12/05/14 10:32:50 INFO mapred.JobClient:     FILE_BYTES_READ=22890
12/05/14 10:32:50 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=194702
12/05/14 10:32:50 INFO mapred.JobClient:   Map-Reduce Framework
12/05/14 10:32:50 INFO mapred.JobClient:     Map output materialized bytes=8034
12/05/14 10:32:50 INFO mapred.JobClient:     Map input records=106
12/05/14 10:32:50 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/05/14 10:32:50 INFO mapred.JobClient:     Spilled Records=212
12/05/14 10:32:50 INFO mapred.JobClient:     Map output bytes=7798
12/05/14 10:32:50 INFO mapred.JobClient:     Map input bytes=1666
12/05/14 10:32:50 INFO mapred.JobClient:     SPLIT_RAW_BYTES=472
12/05/14 10:32:50 INFO mapred.JobClient:     Combine input records=0
12/05/14 10:32:50 INFO mapred.JobClient:     Reduce input records=106
12/05/14 10:32:50 INFO mapred.JobClient:     Reduce input groups=25
12/05/14 10:32:50 INFO mapred.JobClient:     Combine output records=0
12/05/14 10:32:50 INFO mapred.JobClient:     Reduce output records=86
12/05/14 10:32:50 INFO mapred.JobClient:     Map output records=106

--
Mr. Anwar Shaikh
Delhi Technological University, Delhi
+91 92 50 77 12 44