From: Harsh J
Date: Wed, 2 Feb 2011 09:01:30 +0530
Subject: Re: "Map input bytes" vs HDFS_BYTES_READ
To: mapreduce-dev@hadoop.apache.org

HDFS_BYTES_READ is a FileSystem-level counter. It counts the bytes
actually read from the file system (the lower level). "Map input bytes"
is the number of bytes the RecordReader has processed for the records it
read from the input stream.

For plain text files, I believe both counters should report about the
same value, provided entire records are read and no other operation is
performed on each line. But when you throw in a compressed file, you'll
notice that HDFS_BYTES_READ is far lower than "Map input bytes": the
disk read was small, but the total content in record terms is still the
same as it would be for an uncompressed file.

Hope this clears it up.

On Wed, Feb 2, 2011 at 8:06 AM, Ted Yu wrote:
> In hadoop 0.20.2, what's the relationship between "Map input bytes" and
> HDFS_BYTES_READ ?
>
> name="HDFS_BYTES_READ">203446204073
> name="HDFS_BYTES_WRITTEN">23413127561
> 163502600
> 0
> 965922136488
> 296754600
>
> Thanks

--
Harsh J
www.harshj.com
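
The difference between the two counters described above can be imitated outside Hadoop with a small toy script. This is only a sketch under stated assumptions, not Hadoop code: the function name count_bytes, the sample text, and the use of gzip are made up for illustration, and "bytes read from storage" is simplified to the size of the stored file.

```python
import gzip
import io


def count_bytes(raw: bytes, compressed: bool):
    """Return (bytes read from storage, bytes seen in records).

    Hypothetical helper: the first value plays the role of
    HDFS_BYTES_READ, the second the role of "Map input bytes".
    """
    # What sits on "disk": either the raw text or its gzip form.
    stored = gzip.compress(raw) if compressed else raw
    fs_bytes_read = len(stored)  # assumes the whole file is read once

    # What the record reader sees: decompressed lines, one per record.
    stream = io.BytesIO(stored)
    reader = gzip.GzipFile(fileobj=stream) if compressed else stream
    record_bytes = sum(len(line) for line in reader)

    return fs_bytes_read, record_bytes


text = b"the quick brown fox jumps over the lazy dog\n" * 10000
plain = count_bytes(text, compressed=False)
packed = count_bytes(text, compressed=True)

print("plain  (fs bytes, record bytes):", plain)
print("packed (fs bytes, record bytes):", packed)
```

For the plain input the two numbers match. For the gzip input the storage-side count is much smaller, while the record-side count stays the same as for the uncompressed text.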