hadoop-mapreduce-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Reducers are stuck fetching map data.
Date Tue, 26 Jan 2010 18:52:28 GMT
You mean that documentation?
http://hadoop.apache.org/common/docs/r0.20.1/quickstart.html#Required+Software

J-D
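
Since the fix reported below came down to which JDK the nodes were running, it can help to check the `java -version` banner on every machine. What follows is only a rough sketch in Python: the function names `classify_jvm` and `local_jvm_banner` are hypothetical, and the substring markers ("IBM"/"J9" versus "HotSpot") are an assumed heuristic, not something Hadoop provides.

```python
import subprocess


def classify_jvm(banner: str) -> str:
    """Guess the JVM family from `java -version` text.

    Heuristic only: the marker substrings are assumptions for illustration.
    """
    text = banner.lower()
    if "ibm" in text or "j9" in text:
        return "ibm"
    if "hotspot" in text:
        # HotSpot is the VM shipped in the Sun JDK (and in OpenJDK).
        return "hotspot"
    return "unknown"


def local_jvm_banner(java_cmd: str = "java") -> str:
    """Return the version banner printed by the given java executable."""
    # `java -version` writes its banner to stderr rather than stdout.
    result = subprocess.run(
        [java_cmd, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
    return result.stderr + result.stdout
```

Calling `classify_jvm(local_jvm_banner())` on each node reports the JVM found on the PATH. Hadoop itself takes its JVM from `JAVA_HOME` in `conf/hadoop-env.sh`, so passing that JVM's `bin/java` path to `local_jvm_banner` is the more relevant check.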

On Tue, Jan 26, 2010 at 1:34 AM, Suhail Rehman <suhailrehman@gmail.com> wrote:
> We finally figured it out! The problem was the JDK installation on our
> VMs: they were configured to use the IBM JDK, and the moment we switched
> to the Sun JDK, everything worked flawlessly.
>
> You may want to state somewhere in the documentation that the Sun JDK is
> strongly recommended for use with Hadoop.
>
> Suhail
>
> On Thu, Jan 21, 2010 at 1:13 PM, Suhail Rehman <suhailrehman@gmail.com>
> wrote:
>>
>> We have verified that it does NOT solve the problem at all.  This would
>> lead us to believe that the timeout issue we are experiencing is not part of
>> the shuffle phase. Any other ideas that might help us?
>>
>> The Tasktracker logs show that these reducers are stuck during the copy
>> phase.
>>
>> Suhail
>>
>> On Wed, Jan 20, 2010 at 5:22 PM, Amareshwari Sri Ramadasu
>> <amarsri@yahoo-inc.com> wrote:
>>>
>>> ReadTimeOuts are found to be costly during shuffle if the map runtime is
>>> high.
>>> Please see HADOOP-3327 (http://issues.apache.org/jira/browse/HADOOP-3327)
>>> for shuffle improvements made specifically for ReadTimeOut.
>>>
>>> Thanks
>>> Amareshwari
>>>
>>> On 1/20/10 6:07 PM, "Suhail Rehman" <suhailrehman@gmail.com> wrote:
>>>
>>> We are having trouble running Hadoop MapReduce jobs on our cluster.
>>>
>>> VMs running on an IBM blade center with the following virtualized
>>> configuration:
>>>
>>> Master Node/Namenode: 1x
>>> OS: Xen RedHat Linux 5.2, CPU: 3 vCPU, RAM: 1024 MB
>>> Slaves/DataNode: 3x
>>> OS: Xen RedHat Linux 5.2, CPU: 1 vCPU, RAM: 1024 MB
>>>
>>> We are working with standard Hadoop example code. We are using Hadoop
>>> 0.20.1, stable with the latest patches installed. All VMs have firewalls
>>> turned off as well as SELinux disabled.
>>>
>>> For example, when we try to execute the "wordcount" program on a
>>> provisioned cluster, the Map operations complete successfully, but the
>>> program gets stuck trying to complete the reduce operations.
>>>
>>> On examining the logs, we find that the Reducers are waiting for the
>>> outputs from Map operations on other nodes. Our understanding is that this
>>> communication happens over HTTP, and it appears that all these provisioned
>>> VMs have trouble communicating over HTTP on the ports that Hadoop uses.
>>>
>>> Also, when we try to access the JobTracker web interface to view the
>>> running jobs, the machine takes too long to respond to our queries.
>>> Since both the Reducer communication and the JobTracker web interface
>>> work over HTTP, we think the problem might be a networking issue or a
>>> problem with the built-in HTTP service in Hadoop (Jetty).
>>>
>>> Attached is a partial task log from one of the Reducers. The warning
>>> "WARN org.apache.hadoop.mapred.ReduceTask:
>>> java.net.SocketTimeoutException: Read timed out"
>>> appears on all reducers, and eventually the job either fails to complete
>>> or takes a very long time (about 15 hours to process an 11 GB text file).
>>>
>>> This problem seems to be random: at times the program runs successfully
>>> in about 20 minutes, while at other times it takes 15 hours to complete.
>>>
>>> Any help with regards to this would be much appreciated.
>>>
>>> Regards,
>>>
>>> Suhail Rehman
>>> MS by Research in Computer Science
>>> International Institute of Information Technology - Hyderabad
>>> rehman@research.iiit.ac.in
>>> ---------------------------------------------------------------------
>>> http://research.iiit.ac.in/~rehman
>>>
>>
>>
>>
>
>
>
>

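To gauge how widespread the read timeouts quoted in the original report are, one option is to count the matching warnings in each reducer's task log. This is a minimal sketch, assuming the logs are plain text with one event per line. The names `count_read_timeouts` and `count_in_file` are hypothetical, and the match string is taken from the warning quoted in the thread.

```python
from typing import Iterable

# Exception text as quoted in the original report.
TIMEOUT_MARKER = "java.net.SocketTimeoutException: Read timed out"


def count_read_timeouts(lines: Iterable[str]) -> int:
    """Count log lines that contain the read-timeout exception text."""
    return sum(1 for line in lines if TIMEOUT_MARKER in line)


def count_in_file(path: str) -> int:
    """Count read timeouts in one log file, replacing undecodable bytes."""
    with open(path, "r", errors="replace") as handle:
        return count_read_timeouts(handle)
```

Running `count_in_file` over the reducer logs from each node and comparing the counts may show whether the timeouts cluster on particular machines or are spread evenly across the cluster.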