hadoop-common-user mailing list archives

From Ryan LeCompte <lecom...@gmail.com>
Subject Re: Reduce tasks running out of memory on small hadoop cluster
Date Sun, 21 Sep 2008 04:59:37 GMT
Yes I did, but that didn't solve my problem, since I'm working with a
fairly large data set (8 GB).

Thanks,
Ryan
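
One thing worth noting here: the heap set via HADOOP_HEAPSIZE in
conf/hadoop-env.sh applies to the Hadoop daemons themselves, not to the
map/reduce child JVMs that the tasktrackers launch. Those take their heap
from mapred.child.java.opts, which at the time defaulted to a fairly small
value (around -Xmx200m). Below is a minimal sketch of raising it per job
with the old JobConf API; the -Xmx value and the reduce-task count are
purely illustrative, not recommendations.

import org.apache.hadoop.mapred.JobConf;

public class ConfigureReduceMemory {
    public static void main(String[] args) throws Exception {
        // Per-job configuration (0.18-era "old" mapred API).
        JobConf conf = new JobConf(ConfigureReduceMemory.class);

        // Heap for each map/reduce child JVM. HADOOP_HEAPSIZE in
        // hadoop-env.sh does NOT reach these child processes;
        // mapred.child.java.opts does (example value only).
        conf.set("mapred.child.java.opts", "-Xmx512m");

        // More reduce tasks means each reducer sees a smaller slice of
        // the input (the count here is just an example).
        conf.setNumReduceTasks(8);

        // ... set mapper/reducer classes and input/output paths here,
        // then submit with JobClient.runJob(conf).
    }
}

The same two properties can also be set in hadoop-site.xml if they should
apply cluster-wide rather than per job.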




On Sep 21, 2008, at 12:22 AM, Sandy <snickerdoodle08@gmail.com> wrote:

> Have you increased the heapsize in conf/hadoop-env.sh to 2000? This  
> helped
> me some, but eventually I had to upgrade to a system with more memory.
>
> -SM
>
>
> On Sat, Sep 20, 2008 at 9:07 PM, Ryan LeCompte <lecompte@gmail.com>  
> wrote:
>
>> Hello all,
>>
>> I'm setting up a small 3-node Hadoop cluster (1 node for the
>> namenode/jobtracker and the other two for datanode/tasktracker). The
>> map tasks finish fine, but the reduce tasks are failing at about 30%
>> with an out-of-memory error. My guess is that the amount of data I'm
>> crunching through simply won't fit in memory during the reduce tasks
>> on two machines (max of 2 reduce tasks on each machine). Is this
>> expected? If I had a larger Hadoop cluster, I could increase the
>> number of reduce tasks so that not all of the data is processed in
>> just the 4 JVMs on the two machines I currently have set up, correct?
>> Is there any way to get the reduce task to not hold all of the data
>> in memory, or is my only option to add more nodes to the cluster and
>> thereby increase the number of reduce tasks?
>>
>> Thanks!
>>
>> Ryan
>>
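
On the closing question in the quoted message, whether the reduce task can
avoid holding all of the data in memory: the framework already streams
values into reduce() through an iterator, so a reducer only has to keep in
memory whatever its own logic accumulates. Below is a minimal sketch of a
reducer that consumes values one at a time using the old (0.18-era) API;
the sum-of-counts logic is only an illustrative stand-in for whatever the
real job does. If the reducer already streams like this and still fails,
the child-JVM heap and the number of reduce tasks (see the sketch earlier
in this message) are the usual next knobs.

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class StreamingSumReducer extends MapReduceBase
        implements Reducer<Text, LongWritable, Text, LongWritable> {

    public void reduce(Text key, Iterator<LongWritable> values,
                       OutputCollector<Text, LongWritable> output,
                       Reporter reporter) throws IOException {
        // Values arrive from the sorted/merged map output as an
        // iterator; nothing here requires all values for a key (let
        // alone the whole 8 GB input) to be resident at once.
        long sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();   // consume one value at a time
        }
        // Emit incrementally. Collecting values into a List or Map
        // inside reduce() is the usual cause of reducer
        // OutOfMemoryErrors.
        output.collect(key, new LongWritable(sum));
    }
}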
