hadoop-common-user mailing list archives

From Dennis Kubes <nutch-...@dragonflymc.com>
Subject Re: Out of Memory during Sorts
Date Sun, 11 Jun 2006 19:25:29 GMT
Okay, I changed io.sort.factor to 100 and now it works.  Does anybody 
have any idea why?  I also restarted the cluster via stop-all and 
start-all, so maybe memory was released?
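
For reference, the property I changed now reads as follows (only the 
value is different from what I posted before):

<property>
 <name>io.sort.factor</name>
 <value>100</value>
 <description>
 The number of streams to merge at once while sorting
 files.  This determines the number of open file handles.
 </description>
</property>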

Dennis

Dennis Kubes wrote:
> Can someone point me in the right direction on configuring settings 
> for large sorting operations (> 1M rows)?  I keep getting out of memory 
> exceptions during the sort phase.  Here are my current settings.  I 
> have 2G of heap space on each box.
>
> Dennis
>
> <property>
>  <name>io.sort.factor</name>
>  <value>20</value>
>  <description>
>  The number of streams to merge at once while sorting
>  files.  This determines the number of open file handles.
>  </description>
> </property>
>
> <property>
>  <name>io.sort.mb</name>
>  <value>200</value>
>  <description>
>  The total amount of buffer memory to use while sorting
>  files, in megabytes.  By default, gives each merge stream 1MB, which
>  should minimize seeks.
>  </description>
> </property>
>
> <property>
>  <name>io.file.buffer.size</name>
>  <value>8192</value>
>  <description>
>  The size of buffer for use in sequence files.
>  The size of this buffer should probably be a multiple of hardware
>  page size (4096 on Intel x86), and it determines how much data is
>  buffered during read and write operations.
>  </description>
> </property>
>
> <property>
>  <name>io.bytes.per.checksum</name>
>  <value>4096</value>
>  <description>
>  The number of bytes per checksum.  Must not be larger than
>  io.file.buffer.size.
>  </description>
> </property>
>
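
In case it helps, I believe the same sort settings can also be 
overridden per job through the JobConf instead of cluster-wide.  A 
rough, untested sketch (class name and values just for illustration, 
property names as in the config above):

import org.apache.hadoop.mapred.JobConf;

public class SortTuning {
  public static void main(String[] args) {
    JobConf conf = new JobConf();

    // Override the sort/merge settings for this job only; these are
    // the same property names used in the XML config above.
    conf.setInt("io.sort.factor", 100);  // streams merged at once
    conf.setInt("io.sort.mb", 200);      // total sort buffer, in MB

    // ... set mapper/reducer, input/output paths, and submit as usual.
  }
}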
