Date: Fri, 27 Jun 2014 00:07:43 -0700
Subject: Basic question regarding calculating number of reducers
From: KayVajj
To: user@hive.apache.org

Hi,

I have a basic question regarding the calculation of the number of reducers in Hive. I know that it is computed as <Total-Input-Size>/<Bytes-Per-Reducer>.

In the case of compressed files, it is not clear whether the total input size is measured when compressed or decompressed. Doesn't it make a significant difference if it is calculated on the compressed size? I have tried checking the API, and it looks like it gets the file sizes from the NameNode using the getFileInfo API call. This tells me that it is calculated on the compressed size.
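
To make the concern concrete, here is a minimal Python sketch of the formula above. It is not Hive's code: the function name, the rounding up, the floor of one reducer, and all of the sizes are my own assumptions, chosen only to show how far the two estimates can drift apart.

```python
def estimate_reducers(total_input_bytes: int, bytes_per_reducer: int) -> int:
    """Estimate reducers as ceil(total_input_bytes / bytes_per_reducer).

    Rounding up and the minimum of one reducer are assumptions of this
    sketch, not behaviour confirmed from the Hive source.
    """
    if bytes_per_reducer <= 0:
        raise ValueError("bytes_per_reducer must be positive")
    # Integer ceiling division avoids floating-point error on large sizes.
    return max(1, -(-total_input_bytes // bytes_per_reducer))


# Hypothetical numbers, for illustration only.
BYTES_PER_REDUCER = 1_000_000_000   # assumed: 1 GB of input per reducer
compressed_size = 10_000_000_000    # assumed: 10 GB on disk, as a file listing would report
compression_ratio = 5               # assumed: data expands 5x when decompressed
decompressed_size = compressed_size * compression_ratio

reducers_from_compressed = estimate_reducers(compressed_size, BYTES_PER_REDUCER)
reducers_from_decompressed = estimate_reducers(decompressed_size, BYTES_PER_REDUCER)

print("estimate from compressed size:  ", reducers_from_compressed)
print("estimate from decompressed size:", reducers_from_decompressed)
```

With these made-up numbers the estimate is 10 reducers from the compressed size and 50 from the decompressed size, so each reducer could end up handling roughly five times more data than the per-reducer setting suggests.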

Kay