Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <1459659300.107481264695215128.JavaMail.jira@brutus.apache.org>
Date: Thu, 28 Jan 2010 16:13:35 +0000 (UTC)
From: "Xing Shi (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Updated: (MAPREDUCE-1424) prevent Merger fd leak when there are lots empty segments in mem
In-Reply-To: <678121473.107341264695096784.JavaMail.jira@brutus.apache.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
[ https://issues.apache.org/jira/browse/MAPREDUCE-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Shi updated MAPREDUCE-1424:
--------------------------------

Description:
The Merger will open too many files on disk when there are too many empty segments in the shuffle memory.

We process large data (e.g. > 100 TB) in one job, and we use our own partitioner to partition the map output, so one map output shuffles wholly to a single reduce. As a result, the other reduces receive lots of empty segments.

    mapOutput_n --whole--> reduce1
                --empty--> reduce2
                --empty--> reduce3
                --empty--> reduce4

Because our input data is large, there are lots of maps (about 10^5). Mostly there are several thousand maps feeding one reduce, and so several thousand empty segments per reduce.

For example:
    1000 map outputs (on disk) + 3000 empty segments (in memory)

Then, with io.sort.factor = 100, in the first merge cycle the merger will merge 10 + 3000 segments [by getPassFactor: (1000 - 1) % 100 + 1, plus the 3000 empty ones]. Because there is no real data in memory, the remaining 990 map outputs end up replacing the 3000 empty in-memory segments, so we open about 1000 fds.

Once there are several reduces on one TaskTracker, we will open several thousand fds.

I think we can remove the empty segments in the first collection; moreover, in the shuffle phase we can also avoid adding empty segments into memory.
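The first-pass arithmetic above can be sketched in code. The following is a hypothetical standalone sketch, not the actual org.apache.hadoop.mapred.Merger class: getPassFactor models the first-pass factor formula referenced above (merge just enough segments on pass one so that every later pass can merge exactly `factor` segments), and dropEmptySegments models the proposed mitigation of filtering zero-length segments before the merge. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class MergePassSketch {

    // Models the first-pass factor computation: on the first pass,
    // merge only (numSegments - 1) % (factor - 1) + 1 segments so
    // every subsequent pass can merge a full `factor` segments.
    static int getPassFactor(int factor, int passNo, int numSegments) {
        if (passNo > 1 || numSegments <= factor || factor == 1) {
            return factor;
        }
        int mod = (numSegments - 1) % (factor - 1);
        if (mod == 0) {
            return factor;
        }
        return mod + 1;
    }

    // Proposed mitigation sketch: drop zero-length segments up front,
    // so empty in-memory segments never pull extra on-disk files
    // (and their fds) into the merge. `lengths` stands in for the
    // raw lengths of the segments to merge.
    static List<Long> dropEmptySegments(List<Long> lengths) {
        List<Long> out = new ArrayList<>();
        for (long len : lengths) {
            if (len > 0) {
                out.add(len);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The example above: 1000 on-disk map outputs plus 3000
        // empty in-memory segments, io.sort.factor = 100.
        int factor = 100;
        int numSegments = 1000 + 3000;
        System.out.println("first-pass factor = "
                + getPassFactor(factor, 1, numSegments));

        List<Long> lengths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) lengths.add(1L); // real segments
        for (int i = 0; i < 3000; i++) lengths.add(0L); // empty segments
        System.out.println("segments after filtering = "
                + dropEmptySegments(lengths).size());
    }
}
```

With the empty segments filtered out, only the 1000 real segments remain, and the merge never needs to hold extra on-disk files open to stand in for empty in-memory ones.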
" whole "map_n | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce1 " | empty =20 " | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce2 " | empty " | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce3 " | empty " | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce4 Because, our input data is bigger, so there are lots of map(10^5). And most= ly there are several thousands maps to one reduce, and several thousands em= pty segments.=20 For example: 1000 mapOutput(on disk) + 3000 empty segments(in mem) Then, as the io.sort.factor=3D100 in first merge cycle, the merger will merge 10+3000 segments [ by getPa= ssFactor (1000 - 1)%100 + 1 + 30000 ]=EF=BC=8Cbecause there is no real data= in mem, then we should use the left 990 mapOutput to replace the empty 300= 0 mem segments, then we open 1000 fd. Once there are several reduce on one taskTracker, we will open several = thousand fds. I think we can use first collection to remove the empty segments, moreo= ver in shuffle phase, we also can not add the segment into mem. > prevent Merger fd leak when there are lots empty segments in mem > ----------------------------------------------------------------- > > Key: MAPREDUCE-1424 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1424 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: task > Reporter: Xing Shi > > The Merger will open too many files on disk, when there are too many empt= y segments in shuffle mem. > We process larger data , eg. > 100T=EF=BC=8Cin one Job. And we use our pa= rtitioner to partition the map output=EF=BC=8Cand one map output will whole= ly shuffle to one reduce=E3=80=82So the other reduce will get lots of empty= segments. 
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.