hadoop-common-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: reducer out of memory?
Date Thu, 10 May 2012 18:50:19 GMT
Thanks. Let me run this again with the settings provided later in
this thread and report the details.

On Wed, May 9, 2012 at 10:12 PM, Harsh J <harsh@cloudera.com> wrote:
> Can you share your job details (or a sample reducer code) and also
> share your exact error?
>
> If your implementation holds reducer-provided values/keys in memory,
> it can easily cause an OOME if not handled properly.
> The reducer itself reads the values off a sorted file on
> disk and doesn't cache the whole group in memory.
>
> On Thu, May 10, 2012 at 12:20 AM, Yang <teddyyyy123@gmail.com> wrote:
>> It seems that if I put too many records under the same mapper output
>> key, all these records are grouped into one key on one reducer,
>>
>> and then the reducer runs out of memory.
>>
>>
>> but the reducer interface is:
>>
>>       public void reduce(K key, Iterator<V> values,
>>                          OutputCollector<K, V> output,
>>                          Reporter reporter)
>>
>>
>> so all the values belonging to the key can be iterated. In theory
>> they can be read from disk one at a time and do not have to be in
>> memory at the same time.
>> So why am I getting an out-of-heap error? Is there some parameter I
>> could tune (apart from -Xmx, since my box is ultimately bounded in
>> memory capacity)?
>>
>> thanks
>> Yang
>
>
>
> --
> Harsh J
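
To illustrate the point in the quoted reply about holding values in memory, here is a minimal sketch in plain Java. It uses no Hadoop classes; the class name ReduceMemorySketch, the helper lazyValues, and the use of Long values are all hypothetical and chosen only for illustration. lazyValues stands in for an iterator that produces values on demand, roughly as a reducer would read them off a sorted file. The buffered variant copies the whole group into a list, so its memory use grows with the number of values for the key; the streaming variant keeps only a running total.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReduceMemorySketch {

    // Anti-pattern: copies every value of the group into a list first.
    // Memory use grows linearly with the number of values for the key.
    static long sumBuffered(Iterator<Long> values) {
        List<Long> all = new ArrayList<Long>();
        while (values.hasNext()) {
            all.add(values.next());
        }
        long sum = 0;
        for (long v : all) {
            sum += v;
        }
        return sum;
    }

    // Streaming: consumes each value once and keeps only a running total.
    // Memory use stays constant no matter how large the group is.
    static long sumStreaming(Iterator<Long> values) {
        long sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    // Hypothetical stand-in for values that are produced lazily, one at a
    // time, instead of being held in memory all at once. Yields 1..count.
    static Iterator<Long> lazyValues(final long count) {
        return new Iterator<Long>() {
            private long next = 1;

            public boolean hasNext() {
                return next <= count;
            }

            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        // Both variants give the same answer; only the memory profile differs.
        System.out.println(sumStreaming(lazyValues(1000)));
        System.out.println(sumBuffered(lazyValues(1000)));
    }
}
```

Both methods return the same sum for a small group. With a very large group under one key, only the buffered variant is at risk of exhausting the heap, so if a reduce implementation collects values into a list or map before processing them, that is a likely place to look.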
