hadoop-common-dev mailing list archives

From Geoffrey Gallaway <geof...@geoffeg.org>
Subject Developing Hadoop and HDFS
Date Wed, 30 Sep 2009 00:39:31 GMT

Yes, another person looking to contribute to and develop Hadoop. I'm looking
to start off small, fixing a few bugs before moving into larger stuff.

First, a bit of background:
Years ago I had the idea of creating a semi-decentralized distributed file
system. The idea came when I was working for a small-to-medium-sized company
that was looking for a simple backup solution for its workstations. PCs
back then came with 100+ GB hard drives but, since these were simple
workstations, employees were using less than half that space. Why not have
each workstation back up to a few other workstations, duplicating files across
multiple machines for redundancy? RAID for the network. I started coming up
with design and architecture specs, protocol examples and even started
writing a bit of the system (in Java). I tried to find a few interested
developers but everyone seemed to think the task was much too large to be
accomplished as a side project (and I didn't think, given the IT industry of
the time, that anyone would fund it). Later, I realized such a distributed
system could be much more than a simple file backup solution.

It looks like Hadoop and HDFS are creating a lot of what I had wanted to
create; they have already surpassed what I had in mind in most ways.

So, where should I start? Just start fixing bugs listed in JIRA?

