From: Rahul Bhattacharjee
Date: Mon, 6 May 2013 20:15:06 +0530
Subject: Uber Job!
To: "user@hadoop.apache.org"

Hi,

I was going through the definition of an uber job in Hadoop.

A job is considered uber when it has 10 or fewer maps, one reducer, and total input data smaller than one DFS block.

I have some doubts here:

Splits are created according to the DFS block size. Creating 10 mappers from one block of data is possible by changing some settings (for example, the max split size). But I am trying to understand why a job would need to run around 10 maps for 64 MB of data.
One possibility is that the job is immensely CPU intensive. Is that a correct assumption, or is there some other reason for this?
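
To make the split arithmetic concrete, here is a rough sketch of how I understand the split count to be derived. It assumes the split size is max(minSize, min(maxSize, blockSize)) with a 10% slack on the last split, which is my reading of FileInputFormat. The function names and the 7 MB cap are my own, chosen only for illustration.

```python
# Sketch of split sizing. The formula and the 1.1 slack factor are assumed
# to mirror Hadoop's FileInputFormat; the 7 MB cap is illustrative only.
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # assumed slack: the last split may be up to 10% larger


def compute_split_size(block_size, min_size, max_size):
    """Clamp the block size between the configured min and max split sizes."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, block_size, min_size=1, max_size=2**63 - 1):
    """Count the splits produced for a single splittable file."""
    split_size = compute_split_size(block_size, min_size, max_size)
    remaining = file_size
    splits = 0
    # Keep carving full-size splits while more than 1.1 splits' worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # the tail becomes the final split
    return splits


# One 64 MB file on a 64 MB block size, with and without a max split cap.
default_splits = count_splits(64 * MB, 64 * MB)
capped_splits = count_splits(64 * MB, 64 * MB, max_size=7 * MB)
print(default_splits, capped_splits)
```

Under these assumptions, one 64 MB block yields a single split by default, and capping the max split size at 7 MB yields nine 7 MB splits plus a 1 MB tail.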

Thanks,
Rahul

