hadoop-common-dev mailing list archives

From Fernando Doglio <fernando.dog...@moove-it.com>
Subject Re: Looking for a place to start
Date Fri, 09 Mar 2012 12:14:56 GMT
Thank you, everyone. I'll take a look at the book suggested by Dave
and Alexander... I was just looking for something else aside
from logs, but it could be a good start, you're right :)

Thanks again!

On Wed, Mar 7, 2012 at 6:52 PM, Gauthier, Alexander <
Alex.Gauthier@teradata.com> wrote:

> Sounds like you're looking for a "problem" to solve. You mentioned being a
> "web developer", so how about loading some web logs and trying some
> sessionization analysis? There are plenty of map-reduce functions out
> there doing just that (with minor modification to conform to your log
> format). That would be a good place to start thinking in terms of "big
> data" :)
> HTH.
> -----Original Message-----
> From: Fernando Doglio [mailto:fernando.doglio@moove-it.com]
> Sent: Tuesday, March 06, 2012 5:21 AM
> To: common-dev@hadoop.apache.org
> Subject: Looking for a place to start
> Hello everyone, this is my first mail to list.
> My question has probably been answered before, but I couldn't find a way
> to search through the archives so.. here it goes:
> I've been toying around with Hadoop for a few weeks now, I've installed
> Cloudera's VM, tried some of the examples, wrote the classic word count
> example (seems like it's the "hello world" of Hadoop :P)  using streaming
> and now I'm looking for a  bigger challenge.
> My main purpose with these tests is to train myself to think in "big data"
> terms, instead of the classic approach a web developer takes when dealing
> with information.
> So, taking all this into account, what would you recommend I try next?
> I've been looking for a big source of data to work with, something to get
> information out of. I know I could generate it myself, but I was hoping
> that something like that would already exist somewhere.
> What were your next steps when starting out with this tech?
> Thanks in advance!
> Fernando
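
The sessionization idea suggested in the reply can be sketched as a Hadoop Streaming reducer in Python, since the original poster already wrote word count with streaming. This is only a minimal sketch under stated assumptions, not anyone's actual job: it assumes a mapper (not shown) has already turned each log line into `user<TAB>epoch_seconds`, that Hadoop delivers lines grouped by user, and that 30 minutes of inactivity ends a session. The names `split_sessions`, `reduce_stream`, and `SESSION_GAP_SECONDS` are hypothetical.

```python
import sys
from itertools import groupby

# Assumption: a gap of more than 30 minutes starts a new session.
SESSION_GAP_SECONDS = 30 * 60


def split_sessions(timestamps, gap=SESSION_GAP_SECONDS):
    """Group epoch-second timestamps into sessions separated by more than `gap`."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # first hit, or gap exceeded: new session
    return sessions


def parse(lines):
    """Parse 'user<TAB>epoch_seconds' lines (assumed mapper output format)."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        user, ts = line.split("\t", 1)
        yield user, int(ts)


def reduce_stream(lines):
    """Yield (user, session_index, start, end, hits) for each session.

    Relies on the input being grouped by user, which the streaming
    shuffle provides when the user is the key.
    """
    for user, group in groupby(parse(lines), key=lambda kv: kv[0]):
        user_timestamps = [ts for _, ts in group]
        for index, session in enumerate(split_sessions(user_timestamps)):
            yield user, index, session[0], session[-1], len(session)


# Run as a reducer only when input is piped in, as Hadoop Streaming does.
if __name__ == "__main__" and not sys.stdin.isatty():
    for record in reduce_stream(sys.stdin):
        print("\t".join(str(field) for field in record))
```

Each user's timestamps are sorted in memory here, which keeps the sketch short but would not scale to users with very large histories; a secondary sort on the timestamp would be the usual alternative.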
