hadoop-common-dev mailing list archives

From Dave Shine <Dave.Sh...@channelintelligence.com>
Subject RE: Looking for a place to start
Date Tue, 06 Mar 2012 13:37:09 GMT
I suggest you get a copy of Hadoop: The Definitive Guide by Tom White. I found it very informative
when I was first starting out.

As for sample "big data", the book uses weather data from the NCDC. You can download it
from https://github.com/tomwhite/hadoop-book/tree/master/input/ncdc/all
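
As a possible next step after word count, here is a rough sketch of a streaming job that finds
the maximum temperature per year in that data, in the spirit of the book's example. The field
offsets (year at 15-19, temperature at 87-92, quality code at 92), the "+9999" missing-value
marker, and the accepted quality codes are assumptions about the NCDC record layout, so check
them against the data before relying on them. The names parse_record, mapper and reducer are
just illustrative.

```python
import sys
from itertools import groupby

MISSING = "+9999"            # assumed marker for a missing reading
GOOD_QUALITY = set("01459")  # assumed "trustworthy" quality codes


def parse_record(line):
    """Return (year, temperature) from one fixed-width record, or None."""
    if len(line) < 93:
        return None
    # Assumed offsets: year, signed temperature, one-character quality code.
    year, temp, quality = line[15:19], line[87:92], line[92]
    if temp == MISSING or quality not in GOOD_QUALITY:
        return None
    return year, int(temp)


def mapper(lines):
    """Emit 'year<TAB>temperature' for every usable record."""
    for line in lines:
        rec = parse_record(line.rstrip("\n"))
        if rec:
            yield "%s\t%d" % rec


def reducer(lines):
    """Input must be sorted by year, as streaming delivers it to a reducer."""
    pairs = (line.rstrip("\n").split("\t") for line in lines)
    for year, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (year, max(int(t) for _, t in group))


# Only touch stdin when a role ("map" or "reduce") is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = mapper if sys.argv[1] == "map" else reducer
    for out in step(sys.stdin):
        print(out)
```

You can try it without a cluster by piping an uncompressed sample file through the map step,
then sort, then the reduce step. With streaming, the same script would be passed as both the
-mapper and the -reducer command, with "map" or "reduce" as its argument.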


-----Original Message-----
From: Fernando Doglio [mailto:fernando.doglio@moove-it.com]
Sent: Tuesday, March 06, 2012 8:21 AM
To: common-dev@hadoop.apache.org
Subject: Looking for a place to start

Hello everyone, this is my first mail to the list.

My question has probably been answered before, but I couldn't find a way to search through
the archives, so here goes:

I've been toying around with Hadoop for a few weeks now. I've installed Cloudera's VM, tried
some of the examples, and written the classic word count example (it seems to be the "hello world"
of Hadoop :P) using streaming, and now I'm looking for a bigger challenge.

My main purpose with these tests is to train myself to think in "big data"
terms, instead of the classic approach a web developer takes when dealing with information.

So, taking all this into account, what would you recommend I try next? I've been looking for
a big source of data to work with, something to get information out of. I know I could generate
it myself, but I was hoping that something like that would already exist somewhere.

What were your next steps when starting out with this tech?

Thanks in advance!


