Subject: M/R, Strange behavior with multiple Gzip files
From: x6i4uybz labs
To: user@hadoop.apache.org
Cc: gpolaert@cyres.fr
Date: Wed, 5 Dec 2012 17:02:25 +0100

Hi everybody,

I have an M/R job that does a bulk import into HBase. It has to process many gzip files (2800 files of about 100 MB each).

I don't understand why my job instantiates 80 maps but runs them sequentially, one after another, as if there were only one big gz file.

Is there a problem in my driver? Or maybe I am missing something. I use "FileInputFormat.addInputPath(job, new Path(args[0]))", where args[0] is a directory.

Can you help me, please?

Thanks,
Guillaume
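As a sanity check on the numbers in the question, here is a rough sketch of how many map tasks one might expect from such a directory. It assumes the stock text input format, where a plain gzip file is not splittable and is therefore read by a single map, so 2800 .gz files would be expected to give about 2800 maps rather than 80. The class GzipSplitEstimate, its methods, the file names, and the 64 MB split size are all hypothetical and not part of Hadoop; the real split computation may differ slightly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: estimates how many input splits
// (and therefore map tasks) a set of input files would produce.
class GzipSplitEstimate {

    // Assumption: files ending in ".gz" are plain gzip and cannot be split.
    static boolean isSplittable(String fileName) {
        return !fileName.endsWith(".gz");
    }

    // One split for a non-splittable or small file; otherwise a simple
    // ceiling of fileSize / splitSize (an approximation of the real logic).
    static long splitsForFile(String fileName, long fileSize, long splitSize) {
        if (fileSize <= 0 || splitSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        if (!isSplittable(fileName) || fileSize <= splitSize) {
            return 1;
        }
        return (fileSize + splitSize - 1) / splitSize;
    }

    // Sums the per-file estimates for a whole input directory listing.
    static long estimateMapTasks(Map<String, Long> fileSizes, long splitSize) {
        long total = 0;
        for (Map.Entry<String, Long> e : fileSizes.entrySet()) {
            total += splitsForFile(e.getKey(), e.getValue(), splitSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Made-up listing that mirrors the question: 2800 gzip files of ~100 MB.
        Map<String, Long> files = new LinkedHashMap<>();
        for (int i = 0; i < 2800; i++) {
            files.put(String.format("input-%04d.gz", i), 100 * mb);
        }
        System.out.println("estimated map tasks: " + estimateMapTasks(files, 64 * mb));
    }
}
```

If the job really reports 80 maps for 2800 gzip files, the difference from this estimate suggests looking at which input format the driver sets and at how many files args[0] actually resolves to.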