From: java8964
To: user@hadoop.apache.org
Subject: RE: spilled records
Date: Fri, 16 May 2014 10:28:08 -0400

Your first understanding is not correct. Where do you get that interpretation from the book?

About the #spilled records, every record of mapper output will be spilled at least once.
So in the ideal scenario, these two numbers should be equal. If they are not, and the spilled-records count is much larger than the mappers' output record count, then you may need to adjust the "io.sort.mb" configuration.
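
To make the comparison above concrete, here is a minimal sketch in Python. It assumes you have already read the map-side "Spilled Records" and "Map output records" counter values from the job report yourself. The function names and the 1.5 threshold are hypothetical, chosen only for illustration, and are not part of Hadoop.

```python
def spill_ratio(spilled_records: int, map_output_records: int) -> float:
    """Return how many times, on average, each map output record was spilled.

    A ratio of 1.0 is the ideal case described above: every record is
    written to disk exactly once.
    """
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records


def may_need_larger_sort_buffer(spilled_records: int,
                                map_output_records: int,
                                threshold: float = 1.5) -> bool:
    """Flag jobs whose spill count is much larger than the map output count.

    The threshold is an assumption for illustration, not a Hadoop default.
    """
    return spill_ratio(spilled_records, map_output_records) > threshold


if __name__ == "__main__":
    # Hypothetical counter values, not taken from a real job.
    print(spill_ratio(1_000_000, 1_000_000))
    print(may_need_larger_sort_buffer(3_000_000, 1_000_000))
```

A ratio well above 1.0 suggests records are being spilled and merged more than once, which is the case where raising "io.sort.mb" may help.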

Yong


From: yu_libo@hotmail.com
To: user@hadoop.apache.org
Subject: spilled records
Date: Thu, 8 May 2014 21:17:35 -0400
Hi,

According to "Hadoop: The Definitive Guide", when mapreduce.job.shuffle.input.buffer.percent is
large enough, the map outputs are copied directly into the reduce JVM memory.

I set this parameter to 0.5, which is large enough to hold the map outputs, but #spilled records is still the same
as reduce input records. Does anybody know why? Thanks.

Libo

