Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <1459659300.107481264695215128.JavaMail.jira@brutus.apache.org>
Date: Thu, 28 Jan 2010 16:13:35 +0000 (UTC)
From: "Xing Shi (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Updated: (MAPREDUCE-1424) prevent Merger fd leak when there are lots empty segments in mem
In-Reply-To: <678121473.107341264695096784.JavaMail.jira@brutus.apache.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
[ https://issues.apache.org/jira/browse/MAPREDUCE-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Shi updated MAPREDUCE-1424:
--------------------------------

Description:
The Merger will open too many files on disk when there are too many empty segments in the shuffle memory.

We process large data (e.g. > 100 TB) in one job, and we use our own partitioner to partition the map output, so one map output shuffles wholly to a single reduce. As a result, the other reduces receive lots of empty segments.

    mapOutput_n --whole--> reduce1
                --empty--> reduce2
                --empty--> reduce3
                --empty--> reduce4

Because our input data is large, there are lots of maps (about 10^5). Mostly there are several thousand maps feeding one reduce, and so several thousand empty segments per reduce.

For example:
    1000 map outputs (on disk) + 3000 empty segments (in memory)

Then, with io.sort.factor = 100, in the first merge cycle the merger will merge 10 + 3000 segments [by getPassFactor: (1000 - 1) % 100 + 1, plus the 3000 empty ones]. Because there is no real data in memory, the remaining 990 map outputs end up replacing the 3000 empty in-memory segments, so we open about 1000 fds.

Once there are several reduces on one TaskTracker, we will open several thousand fds.

I think we can remove the empty segments in the first collection; moreover, in the shuffle phase we can also avoid adding empty segments into memory.
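The first-pass arithmetic above can be sketched in code. The following is a hypothetical standalone sketch, not the actual org.apache.hadoop.mapred.Merger class: getPassFactor models the first-pass factor formula referenced above (merge just enough segments on pass one so that every later pass can merge exactly `factor` segments), and dropEmptySegments models the proposed mitigation of filtering zero-length segments before the merge. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class MergePassSketch {

    // Models the first-pass factor computation: on the first pass,
    // merge only (numSegments - 1) % (factor - 1) + 1 segments so
    // every subsequent pass can merge a full `factor` segments.
    static int getPassFactor(int factor, int passNo, int numSegments) {
        if (passNo > 1 || numSegments <= factor || factor == 1) {
            return factor;
        }
        int mod = (numSegments - 1) % (factor - 1);
        if (mod == 0) {
            return factor;
        }
        return mod + 1;
    }

    // Proposed mitigation sketch: drop zero-length segments up front,
    // so empty in-memory segments never pull extra on-disk files
    // (and their fds) into the merge. `lengths` stands in for the
    // raw lengths of the segments to merge.
    static List<Long> dropEmptySegments(List<Long> lengths) {
        List<Long> out = new ArrayList<>();
        for (long len : lengths) {
            if (len > 0) {
                out.add(len);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The example above: 1000 on-disk map outputs plus 3000
        // empty in-memory segments, io.sort.factor = 100.
        int factor = 100;
        int numSegments = 1000 + 3000;
        System.out.println("first-pass factor = "
                + getPassFactor(factor, 1, numSegments));

        List<Long> lengths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) lengths.add(1L); // real segments
        for (int i = 0; i < 3000; i++) lengths.add(0L); // empty segments
        System.out.println("segments after filtering = "
                + dropEmptySegments(lengths).size());
    }
}
```

With the empty segments filtered out, only the 1000 real segments remain, and the merge never needs to hold extra on-disk files open to stand in for empty in-memory ones.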
" whole "map_n | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce1 " | empty =20 " | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce2 " | empty " | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce3 " | empty " | =EF=BF=A3=EF=BF=A3=EF=BF=A3=EF=BF=A3---> reduce4 Because, our input data is bigger, so there are lots of map(10^5). And most= ly there are several thousands maps to one reduce, and several thousands em= pty segments.=20 For example: 1000 mapOutput(on disk) + 3000 empty segments(in mem) Then, as the io.sort.factor=3D100 in first merge cycle, the merger will merge 10+3000 segments [ by getPa= ssFactor (1000 - 1)%100 + 1 + 30000 ]=EF=BC=8Cbecause there is no real data= in mem, then we should use the left 990 mapOutput to replace the empty 300= 0 mem segments, then we open 1000 fd. Once there are several reduce on one taskTracker, we will open several = thousand fds. I think we can use first collection to remove the empty segments, moreo= ver in shuffle phase, we also can not add the segment into mem. > prevent Merger fd leak when there are lots empty segments in mem > ----------------------------------------------------------------- > > Key: MAPREDUCE-1424 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1424 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: task > Reporter: Xing Shi > > The Merger will open too many files on disk, when there are too many empt= y segments in shuffle mem. > We process larger data , eg. > 100T=EF=BC=8Cin one Job. And we use our pa= rtitioner to partition the map output=EF=BC=8Cand one map output will whole= ly shuffle to one reduce=E3=80=82So the other reduce will get lots of empty= segments. 
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.