Date: Thu, 24 Jun 2010 08:42:46 +0800
Subject: Re: Questions about recommendation value of the "io.sort.mb" parameter
From: Yu Li
To: common-dev@hadoop.apache.org

Hi Todd,

Thanks a lot for your further explanation! It makes this parameter much
clearer to me. BTW, please allow me to express my thanks to everyone who
helped.

Best Regards,
Carp

On June 24, 2010 at 1:49 AM, Todd Lipcon wrote:

> Plus there's some overhead for each record of map output: specifically, 24
> bytes. So if you output 64MB worth of data, but each of your objects is
> only 24 bytes long itself, you need more than 128MB worth of spill space
> for it. Last, the map output buffer begins spilling when it is partially
> full, so that more records can be collected while the spill proceeds.
>
> 200MB of io.sort.mb has enough headroom for most 64MB input splits that
> don't blow up the data a lot. Expanding much above 200MB doesn't buy most
> jobs much. The good news is that it's easy to tell, by looking at the
> logs, how many times the map tasks are spilling. If you're only spilling
> once, more io.sort.mb will not help.
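Todd's arithmetic above can be sketched numerically. This is a minimal
illustration assuming his 24-byte per-record accounting overhead; the helper
name is hypothetical, not a Hadoop API:

```python
RECORD_OVERHEAD = 24  # bytes per map output record, per Todd's figure above

def spill_space_needed(output_bytes, record_size):
    """Buffer space needed: serialized data plus per-record overhead.
    Illustrative helper, not part of Hadoop."""
    num_records = output_bytes // record_size
    return num_records * (record_size + RECORD_OVERHEAD)

MB = 1024 * 1024
# 64MB of output made of 24-byte records roughly doubles the space needed:
print(spill_space_needed(64 * MB, 24) / MB)  # ~128 (MB)
```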
> -Todd
>
> 2010/6/23 Yu Li
>
> > Hi Jeff,
> >
> > Thanks for your quick reply. It seems my thinking was stuck on the style
> > of job I'm running. Now I'm much clearer about it.
> >
> > Best Regards,
> > Carp
> >
> > 2010/6/23 Jeff Zhang
> >
> > > Hi Yu Li,
> > >
> > > The size of the map output depends on your Mapper class. The Mapper
> > > class does the processing on the input data.
> > >
> > > 2010/6/23 Yu Li:
> > > > Hi Sriguru,
> > > >
> > > > Thanks a lot for your comments and suggestions!
> > > > I still have some questions: since the map side mainly does data
> > > > preparation, say splitting input data into key/value pairs (KVPs),
> > > > then sorting and partitioning before the spill, would the size of
> > > > the map output KVPs be much larger than the input data size? If
> > > > not, since one map task deals with one input split, and one input
> > > > split is usually 64MB, the map output size would be approximately
> > > > 64MB. Could you please give me an example of map output much larger
> > > > than the input split? It has confused me for some time, thanks.
> > > >
> > > > Others, please also help if you know about this, thanks.
> > > >
> > > > Best Regards,
> > > > Carp
> > > >
> > > > On June 23, 2010 at 5:11 PM, Srigurunath Chakravarthi wrote:
> > > >
> > > >> Hi Carp,
> > > >> Your assumption is right that this is a per-map-task setting.
> > > >> However, this buffer stores map output KVPs, not input. Therefore
> > > >> the optimal value depends on how much data your map task is
> > > >> generating.
> > > >>
> > > >> If your output per map is greater than io.sort.mb, these rules of
> > > >> thumb could work for you:
> > > >>
> > > >> 1) Increase the max heap of map tasks to use RAM better, but do
> > > >> not hit swap.
> > > >> 2) Set io.sort.mb to ~70% of the heap.
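The two rules of thumb above can be sketched as a back-of-the-envelope
calculation. The helper name and the exact 70% fraction are illustrative
assumptions, not a Hadoop API or a guarantee:

```python
def suggest_io_sort_mb(map_heap_mb, fraction=0.7):
    """Rule-of-thumb sort buffer size: ~70% of the map task's max heap.
    Illustrative only; size the heap to avoid swapping first."""
    return int(map_heap_mb * fraction)

# With a 300MB child-JVM heap (e.g. -Xmx300m), the rule suggests ~210MB:
print(suggest_io_sort_mb(300))  # 210
```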
> > > >>
> > > >> Overall, causing extra "spills" (because of insufficient
> > > >> io.sort.mb) is much better than risking swapping (by setting
> > > >> io.sort.mb and the heap too large), in terms of the relative
> > > >> performance penalty you will pay.
> > > >>
> > > >> Cheers,
> > > >> Sriguru
> > > >>
> > > >> >-----Original Message-----
> > > >> >From: Yu Li [mailto:carp84@gmail.com]
> > > >> >Sent: Wednesday, June 23, 2010 12:27 PM
> > > >> >To: common-dev@hadoop.apache.org
> > > >> >Subject: Questions about recommendation value of the "io.sort.mb"
> > > >> >parameter
> > > >> >
> > > >> >Dear all,
> > > >> >
> > > >> >I have a question about the "io.sort.mb" parameter. We can find
> > > >> >material from Yahoo! or Cloudera which recommends setting this
> > > >> >value to 200 if the job scale is large, but I'm confused about
> > > >> >this. As I understand it, the tasktracker launches a child JVM
> > > >> >for each task, and "io.sort.mb" specifies the buffer size in
> > > >> >memory inside one map task's child JVM. The default value of
> > > >> >100MB should be large enough, because the input split of one map
> > > >> >task is usually 64MB, as large as the block size we usually set.
> > > >> >Then why is "io.sort.mb" recommended to be 200MB for large jobs
> > > >> >(and it really works)? How could the job size affect the
> > > >> >procedure?
> > > >> >Is there any fault in my understanding? Any comment/suggestion
> > > >> >will be highly valued; thanks in advance.
> > > >> >
> > > >> >Best Regards,
> > > >> >Carp
> > >
> > > --
> > > Best Regards
> > >
> > > Jeff Zhang
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
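Pulling the thread's numbers together, a rough spill-count estimate shows
why 200MB can help where 100MB spills repeatedly. This sketch assumes the
24-byte per-record overhead from Todd's message and a 0.80 spill threshold
(the io.sort.spill.percent default in Hadoop of this era); the function is
illustrative, not part of Hadoop:

```python
import math

MB = 1024 * 1024
RECORD_OVERHEAD = 24    # bytes per map output record, per Todd's message
SPILL_THRESHOLD = 0.80  # assumed io.sort.spill.percent default

def estimate_spills(output_bytes, record_size, io_sort_mb):
    """Rough number of spills: buffered bytes (data plus per-record
    overhead) divided by the usable fraction of the sort buffer."""
    records = output_bytes // record_size
    buffered = records * (record_size + RECORD_OVERHEAD)
    usable = io_sort_mb * MB * SPILL_THRESHOLD
    return max(1, math.ceil(buffered / usable))

# 64MB of tiny 24-byte records: a 100MB buffer spills twice, 200MB once.
print(estimate_spills(64 * MB, 24, 100))  # 2
print(estimate_spills(64 * MB, 24, 200))  # 1
```

As Todd notes, the logs are the ground truth: if a map task already spills
only once, a larger io.sort.mb buys nothing.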