hadoop-mapreduce-user mailing list archives

From Marcos Ortiz <mlor...@uci.cu>
Subject Re: Lost Task Tracker because of no heartbeat
Date Wed, 16 Mar 2011 19:12:40 GMT
On Wed, 2011-03-16 at 17:50 +0100, baran cakici wrote:
> Hi Everyone,
> 
> I am doing a project with Hadoop MapReduce for my master's thesis, and I
> have a strange problem on my system.
> 
> First of all, I use Hadoop-0.20.2 on Windows XP Pro with the Eclipse
> plug-in. When I start a job with a big input (4 GB; maybe not that big,
> but the algorithm requires some time), I lose my TaskTracker within
> several minutes or seconds. I mean, "Seconds since heartbeat" keeps
> increasing, and after 600 seconds I lose the TaskTracker.
> 
> I read somewhere that this can occur because of a small limit on open
> files (ulimit -n). I tried to increase this value, but the maximum I can
> set in Cygwin is 3200 (ulimit -n 3200); the default value is 256.
> Actually, I don't know whether it helps or not.
> 
> There are some errors in my jobtracker.log and tasktracker.log; I have
> posted those too.
> 
> Jobtracker.log
> 
> -Call to localhost/127.0.0.1:9000 failed on local exception:
> java.io.IOException: An existing connection was forcibly closed by the
> remote host
> 
> another :
> -
> 2011-03-15 12:13:30,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 97125 ms debug.
> 2011-03-15 12:16:50,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 297125 ms debug.
> 2011-03-15 12:20:10,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 497125 ms debug.
> 2011-03-15 12:23:30,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 697125 ms debug.
> 
> Error launching task
> Lost tracker 'tracker_apple:localhost/127.0.0.1:2654'
> 
> My logs (jobtracker.log, tasktracker.log ...) are in the attachment.
> 
> I really need help; I don't have much time left for my thesis.
> 
> Thanks a lot for your help,
> 
> Baran 

Regards, Baran 
I have been analyzing your logs and I have several questions:
1- In hadoop-Baran-jobtracker-apple.log you have this:
Cleaning up the system directory
2011-03-15 01:18:44,468 INFO org.apache.hadoop.mapred.JobTracker:
problem cleaning system directory:
hdfs://localhost:9000/cygwin/usr/local/hadoop-datastore/hadoop-Baran/mapred/system
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot
delete /cygwin/usr/local/hadoop-datastore/hadoop-Baran/mapred/system.
Name node is in safe mode.

This indicates a problem with HDFS: the NameNode was still in safe mode,
so the JobTracker could not clean up its system directory.
Can you provide the output of:
     hadoop dfsadmin -report 
on the NameNode?
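
As a rough aid for reading the JobTracker log, here is a minimal Python
sketch (not part of Hadoop). It checks whether a SafeModeException appears
and lists task attempts whose reported "ms debug" age has passed a limit.
The function name scan_jobtracker_log and the 600000 ms threshold are
assumptions for illustration; the threshold is taken from the 600 seconds
mentioned in the original message.

```python
import re

# Matches JobTracker lines such as:
#   attempt_201103151143_0002_m_000091_0 is 697125 ms debug.
# Whitespace between the tokens may include a line wrap.
AGE_RE = re.compile(r"(attempt_\w+)\s+is\s+(\d+)\s+ms\s+debug\.")

# Assumption: the 600 s limit mentioned in the original message.
EXPIRY_MS = 600_000


def scan_jobtracker_log(text, expiry_ms=EXPIRY_MS):
    """Return (safe_mode_seen, overdue) for a JobTracker log given as a string.

    safe_mode_seen is True if "SafeModeException" occurs anywhere in the text.
    overdue maps each attempt id to its largest reported age in ms,
    restricted to attempts whose age exceeds expiry_ms.
    """
    safe_mode_seen = "SafeModeException" in text
    max_age = {}
    for attempt, age in AGE_RE.findall(text):
        max_age[attempt] = max(max_age.get(attempt, 0), int(age))
    overdue = {a: ms for a, ms in max_age.items() if ms > expiry_ms}
    return safe_mode_seen, overdue


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py hadoop-Baran-jobtracker-apple.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        print(scan_jobtracker_log(fh.read()))
```

On a live cluster, `hadoop dfsadmin -safemode get` reports whether the
NameNode is currently in safe mode.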

Regards

-- 
 Marcos Luís Ortíz Valmaseda
 Software Engineer
 Centro de Tecnologías de Gestión de Datos (DATEC)
 Universidad de las Ciencias Informáticas
 http://uncubanitolinuxero.blogspot.com
 http://www.linkedin.com/in/marcosluis2186


