Message-ID: <148081.21313.qm@web15903.mail.cnb.yahoo.com>
References: <85865e431001051312l2a0649a7i5ff3c3a2c4efbe53@mail.gmail.com> <662493.64083.qm@web15908.mail.cnb.yahoo.com>
Date: Wed, 6 Jan 2010 13:55:32 +0800 (CST)
From: Gang Luo
Subject: Re: combiner statistics
To: common-user@hadoop.apache.org

Thanks. What I mean is that the combiner doesn't "intentionally" re-read spilled records back into memory just to combine them. But it does happen that some records are re-read for the sort. I think the combiner should work on those records.

-Gang

----- Original Message -----
From: Ted Xu
To: common-user@hadoop.apache.org
Sent: 2010/1/5 (Tue) 8:43:53 PM
Subject: Re: combiner statistics

Hi Gang,

> My understanding to this is that, the combiner has to re-read some records
> which have already been spilled to disk and combine them with those records
> which come later.
>

I believe the combine operation is done before map spill and after reduce
merge.
Combine only occurs in memory, instead of re-reading records from
disk.

> Besides, I am not sure whether the combiner can guarantee there is only one
> record for each distinct key in each map task. Or does it just "try its
> best" to combine?
>

Yes, they can only "try their best".
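
To illustrate the "try their best" point, here is a minimal sketch in plain Python. It is a toy model and not Hadoop code: the names `combine` and `map_side_spills` and the `buffer_size` parameter are hypothetical, and it assumes a summing combiner that is applied only to the records held in memory when a spill happens. Any combining during a later merge is deliberately left out.

```python
from collections import defaultdict


def combine(records):
    """Sum the values per key among the given (key, value) records."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return sorted(totals.items())


def map_side_spills(records, buffer_size):
    """Toy model: combine each in-memory buffer, then emit it as one spill."""
    spills = []
    for start in range(0, len(records), buffer_size):
        spills.append(combine(records[start:start + buffer_size]))
    return spills


# Hypothetical map output: six records, buffer holds three at a time.
records = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("a", 1), ("b", 1)]
spills = map_side_spills(records, buffer_size=3)

# Merging the spills without combining again leaves duplicate keys.
merged = sorted(record for spill in spills for record in spill)
print(spills)
print(merged)
```

Under these assumptions the keys "a" and "b" each appear once per spill, so the merged map output still holds two records for each of them. Combining per spill shrinks the output but does not guarantee one record per distinct key.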