hadoop-hdfs-dev mailing list archives

From Thomas Koch <tho...@koch.ro>
Subject Re: what kind of improvement for HDFS could possibly be done within 3 months
Date Tue, 28 Sep 2010 10:05:25 GMT
xiaofei du:
> Hi All,
> I am a graduate student, I am preparing for my diploma project. I have
> about 3 months to finish the project. I want to do some work on HDFS.
> However, I have no concept what I could do for improving HDFS. So could you
> guys please give me some suggestions?
> I hope the suggested project could be done within 3 months, I cannot afford
> more time. So the project should not be too hard (at the time, it should
> not be easy, otherwise, I cannot reach the graduation requirement :-) )
> thank you !!!

You could write developer documentation on the inner workings of HDFS
(+HBASE, +MAPREDUCE?) that HDFS users could also understand. In addition to
documenting the current state, you could cover:

- Different strategies to make the NameNode distributed
- The different approaches to append
- How security with Kerberos works

One of the challenges of such a project would be to make it as easy as possible
for developers to understand whichever part of HDFS they're interested in.
Another challenge is to choose a documentation format and workflow that
make it easy to keep the documentation current without much effort.

A completely different project that I also consider important for Hadoop: help
Apache implement an infrastructure based on Git. This could help many projects
in the long run. If you're interested in this, you should subscribe to
infrastructure-dev@apache.org and get in contact with Jukka Zitting.

Best regards,

Thomas Koch, http://www.koch.ro
