hadoop-common-dev mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: class project on Hadoop
Date Tue, 22 Feb 2011 19:00:58 GMT
Hi Nikhil,

For #2, some initial work has been done in a project called Chukwa.  Chukwa is designed to
collect Hadoop metrics, job log files, HDFS/MR client trace log files, and system metrics.
Using this data, it is possible to reconstruct state machines that describe the health of the
Hadoop cluster and to identify faulty hardware.  If you are interested, the research work is in
a JIRA issue at:

https://issues.apache.org/jira/browse/CHUKWA-94

Jiaqi Tan has written a more detailed research paper at:

https://issues.apache.org/jira/secure/attachment/12404723/tan.pdf

There was another research project at Yahoo, which was based on using Hadoop metrics with
one-class SVM classification to identify faulty hardware.  This AI approach has a lot of
potential, but it has not been published yet.  However, it is on the roadmap for the Chukwa project.
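
To illustrate the general idea of one-class classification on node metrics, here is a minimal sketch. It is not the unpublished Yahoo work and not part of Chukwa; as a deliberately simpler stand-in for a one-class SVM, it learns a per-metric baseline from healthy nodes only and flags nodes that deviate strongly from it. The metric names, node names, sample values, and the threshold are all hypothetical.

```python
import math

def fit_baseline(healthy_samples):
    """Learn per-metric mean and standard deviation from healthy nodes only."""
    n = len(healthy_samples)
    dims = len(healthy_samples[0])
    means = [sum(s[d] for s in healthy_samples) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((s[d] - means[d]) ** 2 for s in healthy_samples) / n
        # Fall back to 1.0 so a constant metric does not cause division by zero.
        stds.append(math.sqrt(var) or 1.0)
    return means, stds

def anomaly_score(sample, means, stds):
    """Largest absolute z-score across all metrics of one node."""
    return max(abs(x - m) / s for x, m, s in zip(sample, means, stds))

def flag_suspect_nodes(node_metrics, means, stds, threshold=3.0):
    """Return the sorted names of nodes whose score exceeds the threshold."""
    return sorted(
        node
        for node, sample in node_metrics.items()
        if anomaly_score(sample, means, stds) > threshold
    )

# Hypothetical training data: (disk read latency in ms, CPU iowait in %)
healthy = [(10.0, 2.0), (11.0, 2.5), (9.5, 1.8), (10.5, 2.2), (10.2, 2.1)]
means, stds = fit_baseline(healthy)

# Hypothetical current readings, one tuple per node.
current = {
    "node-a": (10.3, 2.0),
    "node-b": (48.0, 19.0),  # far outside the healthy baseline
    "node-c": (9.8, 2.3),
}
suspects = flag_suspect_nodes(current, means, stds)
print(suspects)
```

Like a one-class SVM, the model is trained only on examples of normal behavior, so no labeled failures are needed. A real one-class SVM would learn a more flexible boundary around the healthy data than this per-metric threshold does.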

Hope this is useful.

Regards,
Eric

On 2/22/11 9:28 AM, "Nikhil Panpalia" <nikhil@cs.utexas.edu> wrote:

Hello everyone,

I'm a graduate student at the University of Texas at Austin. I'm looking for
a research/implementation based project on Hadoop and I came across the list
posted on the wiki page - http://wiki.apache.org/hadoop/ProjectSuggestions.
But, this page was last updated in September, 2009. So, I'm not sure if some
of these ideas have already been implemented or not. I was particularly
interested in the following projects (listed on the wiki page):
1) Sort and Shuffle optimization in the MR framework.
2) Hadoop compatible framework for discovering network topology and
identifying and diagnosing hardware that is not functioning correctly.

Can anyone give me any details about these? Are these projects already in
progress or completed?

Thanks,
Nikhil

