hadoop-common-dev mailing list archives

From Samprita Hegde <samprit...@gmail.com>
Subject Decoupling Intertracker protocol
Date Wed, 22 Apr 2009 13:47:44 GMT
Hi,
   I am trying to assess the feasibility of using shared spaces for the
communication of the Task Completion Events in Hadoop. For this I am trying
to replace the InterTracker Protocol with a co-ordination space, so that one
thread in a Task Tracker puts the MapTask Completion Events onto the space
and another thread receives these events and sends them to the Reduce Tasks
launched by that TaskTracker. Even the Job Tracker can subscribe to these
events to make decisions about scheduling, restarting sluggish tasks,
etc.
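
To make the idea concrete, here is a rough sketch of the two-thread
arrangement. It only illustrates the intent: the space is replaced by a plain
in-memory queue, and every class, field and method name in it
(SharedSpaceSketch, EventSpace, MapCompletionEvent, runDemo) is hypothetical.
None of them are Hadoop or Comet APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch only: these names are not Hadoop or Comet APIs.
public class SharedSpaceSketch {

    // Stand-in for a map task completion event (fields are assumptions).
    static final class MapCompletionEvent {
        final String taskId;
        final String trackerHost;

        MapCompletionEvent(String taskId, String trackerHost) {
            this.taskId = taskId;
            this.trackerHost = trackerHost;
        }
    }

    // Stand-in for the co-ordination space, backed by an in-memory queue.
    static final class EventSpace {
        private final BlockingQueue<MapCompletionEvent> queue =
                new LinkedBlockingQueue<>();

        void put(MapCompletionEvent event) {
            queue.add(event);
        }

        MapCompletionEvent take() throws InterruptedException {
            return queue.take(); // blocks until an event is available
        }
    }

    // Starts one publisher thread and one receiver thread, and returns the
    // task ids in the order the receiver took them from the space.
    static List<String> runDemo(int eventCount) throws InterruptedException {
        EventSpace space = new EventSpace();
        List<String> delivered = new ArrayList<>();

        // Thread 1: puts map completion events onto the space.
        Thread publisher = new Thread(() -> {
            for (int i = 0; i < eventCount; i++) {
                space.put(new MapCompletionEvent("map_" + i, "tracker-a"));
            }
        });

        // Thread 2: takes events from the space; in the real design it would
        // forward them to the Reduce Tasks launched by this tracker.
        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < eventCount; i++) {
                    delivered.add(space.take().taskId);
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        publisher.start();
        receiver.start();
        publisher.join();
        receiver.join(); // join makes the receiver's writes visible here
        return delivered;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(3));
    }
}
```

A queue hands each event to only one taker, so this does not cover the case
where the Job Tracker also subscribes to the same events; a real space would
need to deliver each event to every subscriber.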

Currently all the information seems to be sent via the heartbeat message in
the InterTracker Protocol. Is there a way I can decouple only some parts of
the heartbeat message and put them onto the space (especially the
TaskCompletionEvent and TaskStatus)? That way the task completion events
can be exchanged directly among the Task Trackers and not through the Job
Tracker.
I am not sure if this strategy is good for large-scale MapReduce
applications, but it might work well for small-scale MapReduce jobs.

More information on the co-ordination space that I want to use can be found
here: http://www.caip.rutgers.edu/~zhljenny/comet.htm. I am still going
through Hadoop's code and trying to understand the various protocols between
the processes. If you have any good documentation regarding Hadoop's
architecture, it would be really helpful for me.

Thanks a lot in advance,
Samprita Hegde
