hadoop-common-user mailing list archives

From Srigurunath Chakravarthi <srig...@yahoo-inc.com>
Subject RE: In which configuration file to configure the "fs.inmemory.size.mb" parameter?
Date Thu, 01 Jul 2010 09:17:51 GMT
Carp,
 IMHO, 0.20.x has it. fs.inmemory.size.mb is the reduce-side equivalent of io.sort.mb. In the
reducer tasks, intermediate map output is collected into a buffer (whose size is governed by
this parameter's value), and data is flushed into files as (partially) sorted KVs.

 These files will be re-merged if we end up with more than io.sort.factor files;
otherwise KVs will be served out of these files to the reduce function directly.
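
 To make the flow above concrete, here is a rough Python sketch. It is a toy model only,
not Hadoop's code: the function names are made up, buffer_capacity stands in for
fs.inmemory.size.mb but counts records rather than megabytes, and merge_factor stands in
for io.sort.factor.

```python
import heapq


def spill_sorted_runs(records, buffer_capacity):
    """Collect KVs into a bounded buffer; sort and flush each full buffer as one run."""
    runs, buffer = [], []
    for kv in records:
        buffer.append(kv)
        if len(buffer) >= buffer_capacity:
            runs.append(sorted(buffer))  # each flushed "file" is sorted internally
            buffer = []
    if buffer:
        runs.append(sorted(buffer))  # flush whatever is left at the end
    return runs


def merge_down(runs, merge_factor):
    """Re-merge runs while there are more than merge_factor of them."""
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    runs = list(runs)
    while len(runs) > merge_factor:
        merged = list(heapq.merge(*runs[:merge_factor]))
        runs = runs[merge_factor:] + [merged]
    return runs


def serve_to_reduce(runs):
    """Stream KVs in key order from the remaining runs, as the reduce function would see them."""
    return heapq.merge(*runs)


records = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c"), (0, "z"), (6, "g")]
runs = spill_sorted_runs(records, buffer_capacity=2)
final_runs = merge_down(runs, merge_factor=3)
for key, value in serve_to_reduce(final_runs):
    print(key, value)
```

 A smaller buffer_capacity yields more runs, which makes the extra re-merge pass more
likely; that is the trade-off the two parameters control in this simplified picture.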

 I don't know where in the code it is though, sorry.

cheers,
Sriguru


>-----Original Message-----
>From: Yu Li [mailto:carp84@gmail.com]
>Sent: Thursday, July 01, 2010 1:12 PM
>To: common-user@hadoop.apache.org
>Subject: In which configuration file to configure the
>"fs.inmemory.size.mb" parameter?
>
>Hi all,
>
>I looked through the "Cluster Setup" guide under link
>http://hadoop.apache.org/common/docs/r0.20.1/cluster_setup.html and
>found there's a "fs.inmemory.size.mb" parameter for specifying memory
>allocated for the in-memory file-system used to merge map-outputs at
>the reduces, and this parameter is set in the "core-site.xml". But
>when I checked the "core-default.xml" under path
>"$HADOOP_HOME/src/core/", I didn't find the parameter at all, nor
>could I find the parameter through JTUI after launching jobs.
>Does anybody know about this parameter? Has it been removed from
>release 0.20.X? If it hasn't been removed, how could I set the
>parameter besides using the -D option? Thanks in advance.
>
>Best Regards,
>Carp
