Date: Thu, 3 Feb 2011 19:22:46 -0800
Subject: Re: "Map input bytes" vs HDFS_BYTES_READ
From: Ted Yu
To: mapreduce-dev@hadoop.apache.org

From my limited experiment, I think "Map input bytes" reflects the number of
bytes of local data file(s) when LocalJobRunner is used.

Correct me if I am wrong.

On Tue, Feb 1, 2011 at 7:52 PM, Harsh J wrote:

> Each task counts independently of its attempt/other tasks, thereby
> making the aggregates easier to control. Final counters are aggregated
> only from successfully committed tasks. During the job's run, however,
> counters are shown aggregated from the most successful attempts of a
> task thus far.
>
> On Wed, Feb 2, 2011 at 9:09 AM, Ted Yu wrote:
> > If map task(s) were retried (mapred.map.max.attempts times), how would
> > these two counters be affected ?
> >
> > Thanks
> >
> > On Tue, Feb 1, 2011 at 7:31 PM, Harsh J wrote:
> >
> >> HDFS_BYTES_READ is a FileSystem interface counter. It directly deals
> >> with the FS read (lower level). Map input bytes is what the
> >> RecordReader has processed in number of bytes for records being read
> >> from the input stream.
> >>
> >> For plain text files, I believe both counters must report about the
> >> same value, were entire records being read with no operation performed
> >> on each line. But when you throw in a compressed file, you'll notice
> >> that the HDFS_BYTES_READ would be far lesser than Map input bytes
> >> since the disk read was low, but the total content stored in record
> >> terms was still the same as it would be for an uncompressed file.
> >>
> >> Hope this clears it.
> >>
> >> On Wed, Feb 2, 2011 at 8:06 AM, Ted Yu wrote:
> >> > In hadoop 0.20.2, what's the relationship between "Map input bytes"
> >> > and HDFS_BYTES_READ ?
> >> >
> >> > name="HDFS_BYTES_READ">203446204073
> >> > name="HDFS_BYTES_WRITTEN">23413127561
> >> > 163502600
> >> > 0
> >> > 965922136488
> >> > 296754600
> >> >
> >> > Thanks
> >> >
> >>
> >> --
> >> Harsh J
> >> www.harshj.com
> >>
> >
> >
>
> --
> Harsh J
> www.harshj.com
>
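
To illustrate the compressed-file point quoted in this thread, here is a rough sketch. It is a toy model in plain Python, not Hadoop code, and it does not read or reproduce Hadoop's actual counters. It assumes gzip as the codec, and simulate_counters is a hypothetical helper invented for this example: it compares the bytes pulled from storage (roughly what a filesystem-level counter would see) with the bytes delivered as records after decompression (roughly what a record-reader-level counter would see).

```python
import gzip
import io


def simulate_counters(lines, compress):
    """Hypothetical helper: return (storage_bytes, record_bytes).

    storage_bytes: size of the file as stored (stand-in for a
                   filesystem-level read counter).
    record_bytes:  total size of the records handed to the mapper
                   (stand-in for a record-reader-level counter).
    """
    raw = "".join(line + "\n" for line in lines).encode("utf-8")
    stored = gzip.compress(raw) if compress else raw

    # Bytes that would be pulled from the filesystem.
    storage_bytes = len(stored)

    # Bytes seen record by record, after any decompression.
    if compress:
        stream = gzip.open(io.BytesIO(stored), "rb")
    else:
        stream = io.BytesIO(stored)
    record_bytes = sum(len(record) for record in stream)

    return storage_bytes, record_bytes


lines = ["the quick brown fox jumps over the lazy dog"] * 1000
plain = simulate_counters(lines, compress=False)
packed = simulate_counters(lines, compress=True)

print("plain  (storage, records):", plain)
print("packed (storage, records):", packed)
```

In this toy model the two numbers match for the uncompressed input, while for the gzip input the storage-level number is smaller than the record-level number, which is the pattern described above for HDFS_BYTES_READ and Map input bytes.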