hadoop-common-user mailing list archives

From: Rob Stewart <robstewar...@googlemail.com>
Subject: Re: Slow final few reducers
Date: Sat, 11 Dec 2010 14:11:29 GMT
Sorry, my fault - it's someone running a network simulator on the cluster!

Rob

On 11 December 2010 14:09, Rob Stewart <robstewart57@googlemail.com> wrote:
> OK, slight update:
>
> Immediately underneath public void reduce(), I have added:
> System.out.println("Key: " + key.toString());
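>
> For reference, here is a simplified sketch of that reducer (the
> Text/IntWritable types and the summing body are illustrative only; the
> real job does more than this):
>
> import java.io.IOException;
> import org.apache.hadoop.io.IntWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Reducer;
>
> public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
>   @Override
>   public void reduce(Text key, Iterable<IntWritable> values, Context context)
>       throws IOException, InterruptedException {
>     // Print each incoming key so progress shows up in the task's stdout log.
>     System.out.println("Key: " + key.toString());
>     int sum = 0;
>     for (IntWritable v : values) {
>       sum += v.get();
>     }
>     context.write(key, new IntWritable(sum));
>   }
> }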
>
> And I am logged in to a node that is still working on a reducer. However,
> it stopped printing "Key:" long ago, so it is not processing new keys.
>
> But looking more closely at "top" on this node, there are *two* Linux
> processes running at 100% CPU. The first is java, which, using "jps -l",
> I can see is "Child"; the second is a process called "setdest",
> which I strongly suspect has to do with my Hadoop job.
>
> What is "setdest", and what is it actually doing? And why is it taking so long?
>
> cheers,
>
> Rob Stewart
>
>
>
> On 11 December 2010 12:26, Harsh J <qwertymaniac@gmail.com> wrote:
>> On Sat, Dec 11, 2010 at 5:25 PM, Rob Stewart
>> <robstewart57@googlemail.com> wrote:
>>> Oh,
>>>
>>> I should add that, of the Java processes running on the remaining nodes
>>> for the final wave of reducers, the one taking all the CPU is the "Child"
>>> process (not the TaskTracker). I log into the master, and there too the
>>> Java process taking all the CPU is "Child".
>>>
>>> Is this normal?
>>
>> Yes, "Child" is the Task JVM.
>>
>>>
>>> thanks,
>>> Rob
>>>
>>> On 11 December 2010 11:38, Rob Stewart <robstewart57@googlemail.com> wrote:
>>>> Hi, many thanks for your response.
>>>>
>>>> A few observations:
>>>> - I know for a fact that my key distribution is radically skewed
>>>> (some keys with *many* values, most keys with few).
>>>> - I have overlooked the fact that I need a partitioner. I suspect that
>>>> this will help dramatically.
>>>>
>>>> I realize that the number of partitions should equal the number of
>>>> reducers (e.g. 100).
>>>>
>>>> So if these are my <key>,<value> pairs (where the value is a count):
>>>> <the>,<500>
>>>> <a>,<1000>
>>>> <the cat>,<20>
>>>> <the cat sat on the mat>,<1>
>>>>
>>>> and I have 3 reducers, how do I make:
>>>> Reducer-1: <the>
>>>> Reducer-2: <a>
>>>> Reducer-3: <the cat> & <the cat sat on the mat>
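>>>>
>>>> Is it a custom Partitioner along these lines? (Only a rough sketch of
>>>> what I'm imagining - the hard-coded key checks and the IntWritable
>>>> value type are just to illustrate the three-reducer case above.)
>>>>
>>>> import org.apache.hadoop.io.IntWritable;
>>>> import org.apache.hadoop.io.Text;
>>>> import org.apache.hadoop.mapreduce.Partitioner;
>>>>
>>>> public class SkewAwarePartitioner extends Partitioner<Text, IntWritable> {
>>>>   @Override
>>>>   public int getPartition(Text key, IntWritable value, int numPartitions) {
>>>>     // Send the two heaviest keys to their own reducers and everything
>>>>     // else to the third; assumes numPartitions is 3 as above.
>>>>     String k = key.toString();
>>>>     if (k.equals("the")) return 0;   // Reducer-1
>>>>     if (k.equals("a"))   return 1;   // Reducer-2
>>>>     return 2;                        // Reducer-3
>>>>   }
>>>> }
>>>>
>>>> And then presumably something like
>>>> job.setPartitionerClass(SkewAwarePartitioner.class) on the Job before
>>>> submitting?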
>>>>
>>>>
>>>> thanks,
>>>>
>>>> Rob
>>>>
>>>> On 11 December 2010 11:12, Harsh J <qwertymaniac@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Certain reducers may receive a higher share of data than others
>>>>> (depending on your data/key distribution, the partition function,
>>>>> etc.). Compare the longer reduce tasks' counters with those of the
>>>>> quicker ones.
>>>>>
>>>>> Are you sure that the reducers that take long are definitely the last
>>>>> wave, as in with IDs of 180-200 (and not a random bunch of reduce
>>>>> tasks taking longer)?
>>>>>
>>>>> Also take a look at the logs, and the machines that run these
>>>>> particular reducers -- ensure nothing is wrong on them.
>>>>>
>>>>> There's nothing in Hadoop that specifically makes the "last wave" of
>>>>> reduce tasks take longer. Each reducer writes to its own file and
>>>>> is completely independent.
>>>>>
>>>>> --
>>>>> Harsh J
>>>>> www.harshj.com
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Harsh J
>> www.harshj.com
>>
>
