hadoop-common-user mailing list archives

From "Patterson, Josh" <jpatters...@tva.gov>
Subject RE: Project ideas !
Date Wed, 14 Oct 2009 15:15:42 GMT
Siddu,
If this is for an undergraduate class, I would suggest something that
gives you practice with basic data structures, such as building an
inverted index over a few million documents (maybe Wikipedia pages?).
You will also need to get a general feel for Hadoop along the way.
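
To make the inverted index idea concrete, here is a minimal sketch in plain
Python of how the job splits into map, shuffle, and reduce steps. It runs in a
single process and does not use Hadoop at all; the function names, the simple
tokenizer, and the three sample documents are assumptions made up for
illustration, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair per token in a document."""
    for token in text.lower().split():
        # Naive tokenizer (an assumption): strip common punctuation only.
        term = token.strip(".,;:!?\"'()")
        if term:
            yield term, doc_id


def shuffle(pairs):
    """Group doc ids by term, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    """Turn the doc ids for one term into a sorted, de-duplicated posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Run map, shuffle, and reduce in sequence over an in-memory corpus."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


# Hypothetical three-document corpus, for illustration only.
docs = {
    "d1": "Hadoop runs MapReduce jobs.",
    "d2": "MapReduce jobs read from HDFS.",
    "d3": "Hive runs on Hadoop.",
}
index = build_inverted_index(docs)
print(index["hadoop"])
```

On a real cluster the map and reduce functions would run on different
machines and the framework would do the grouping between them, so only the
bodies of `map_phase` and `reduce_phase` would carry over.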

The University of Washington has some really nice project ideas for
their distributed systems class:

http://www.cs.washington.edu/education/courses/cse490h/09wi/projects/490H.project.ideas.pdf

If you wanted to tackle something a little more advanced, then you could
take a look at Pete Skomoroch's article on finding trends with Hadoop
and Hive:

http://www.cloudera.com/blog/2009/07/31/tracking-trends-with-hadoop-and-hive-on-ec2/

http://www.cloudera.com/blog/2009/09/28/grouping-related-trends-with-hadoop-and-hive/

Things to keep in mind:

1.) Hadoop won't be as simple as writing a single Java app
2.) There will be some overhead involved in rewriting algorithms in
MapReduce
3.) There will also be some overhead involved in setup and maintenance
of the Hadoop Cluster

Take these three things into account when planning how to manage your
time for the project during the semester. Semesters can seem a lot
shorter when you spend too much time on things other than implementing
and testing your algorithm.

Good luck!

Josh Patterson
TVA



-----Original Message-----
From: Siddu [mailto:siddu.sjce@gmail.com] 
Sent: Wednesday, October 14, 2009 6:09 AM
To: common-user@hadoop.apache.org
Cc: core-user@hadoop.apache.org
Subject: Project ideas !

Hello Hadoop Users,

A friend and I are looking for project ideas based on Hadoop as part of
our curriculum.

Can you give us some pointers, please?


Thanks in advance !

Regards,
~Sid~
