From: Manoj Babu <manoj444@gmail.com>
Date: Wed, 11 Jul 2012 18:17:41 +0530
Subject: Re: Mapper basic question
To: mapreduce-user@hadoop.apache.org

Hi Tariq / Arun,

The number of blocks (splits) = total file size / HDFS block size.
(Replication does not multiply the count: each block is stored
"replication factor" times, but yields only one split.)
The number of splits here is nothing but the number of blocks.

Other than increasing the block size (and hence the input split size),
is it possible to limit the number of mappers?

Cheers!
Manoj.

On Wed, Jul 11, 2012 at 6:06 PM, Arun C Murthy <acm@hortonworks.com> wrote:

> Take a look at CombineFileInputFormat - this will create 'meta splits'
> which include multiple small splits, thus reducing the number of maps
> that are run.
>
> Arun
>
> On Jul 11, 2012, at 5:29 AM, Manoj Babu wrote:
>
> Hi,
>
> The number of mappers depends on the number of blocks. Is it possible
> to limit the number of mappers without increasing the HDFS block size?
>
> Thanks in advance.
>
> Cheers!
> Manoj.
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
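The split arithmetic being discussed can be sketched in plain Java. The
sizes below are hypothetical; the sketch assumes the default case where the
input split size equals the HDFS block size, so the map-task count is just
a ceiling division — replication plays no part:

```java
// Sketch of the split-count arithmetic from the thread (hypothetical sizes).
// With default settings, split size == HDFS block size, so:
//   numSplits = ceil(fileSize / blockSize)
// The replication factor affects storage, not the number of splits.
public class SplitMath {
    static long numSplits(long fileSize, long blockSize) {
        // Integer ceiling division: rounds up for a trailing partial block.
        return (fileSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;   // 64 MB, the old HDFS default
        long fileSize = 1024L * 1024 * 1024;  // a 1 GB input file
        // 1024 MB / 64 MB = 16 blocks, hence 16 map tasks by default.
        System.out.println(numSplits(fileSize, blockSize));
    }
}
```

Raising the effective split size above the block size shrinks this count,
which is what the configuration knobs and CombineFileInputFormat exploit.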
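The approaches in the thread boil down to split-size configuration. A
sketch of the relevant job-configuration properties, assuming Hadoop 2.x
names (the 1.x equivalents are mapred.min.split.size and
mapred.max.split.size); the 256 MB value is only an example:

```
<!-- Raise the minimum split size above the block size so FileInputFormat
     packs several blocks into one split: fewer, larger splits mean fewer
     map tasks, with no change to the HDFS block size. Note this only
     helps for files spanning multiple blocks; a plain FileInputFormat
     still produces at least one split per file. -->
<property>
  <name>mapreduce.input.fileinputformat.split.minsize</name>
  <value>268435456</value> <!-- 256 MB -->
</property>

<!-- With CombineFileInputFormat (Arun's suggestion), this instead caps
     the size of each combined 'meta split' built from many small files. -->
<property>
  <name>mapreduce.input.fileinputformat.split.maxsize</name>
  <value>268435456</value> <!-- 256 MB -->
</property>
```

For the many-small-files case the configuration alone is not enough:
CombineFileInputFormat (or, on Hadoop 2.x, the ready-made
CombineTextInputFormat) is what merges several files into one split.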