hadoop-common-user mailing list archives

From "Owen O'Malley" <omal...@apache.org>
Subject Re: Hadoop 101
Date Mon, 01 Sep 2008 16:44:57 GMT
On Mon, Sep 1, 2008 at 2:27 AM, HHB <hubaghdadi@yahoo.ca> wrote:

> Hey,
> I'm reading about Hadoop lately but I'm unable to understand it.
> Would you please explain it to me in easy words?

Let me try. Hadoop is a framework that lets you write programs that work
with very large datasets in reasonable amounts of time using "normal"
computers instead of fancy servers.

It does this by using large numbers of computers (from 4 up to 3,000 or so)
to both store the data and process the data. When you write programs that
run on a large number of computers, one of the primary requirements is that
they handle failures automatically, because you'll be losing a handful of
machines a day. Hadoop handles the failures automatically for you, both in
terms of storage on the local disks and computation.

Also take a look at the
it. Google also has a lot of video
lectures <http://code.google.com/edu/parallel/index.html> about it.

> How to know if I can employ Hadoop in my current company?

The primary way that companies use Hadoop is to process large sets of log
data. Yahoo collects terabytes of user behavior data a day and uses Hadoop
to understand it. Hadoop also has lots of other uses.
See the Powered By <http://wiki.apache.org/hadoop/PoweredBy> page for
examples of what users are doing with it.
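
As a rough illustration of the log-processing case, here is a minimal sketch
in plain Python of the map and reduce style of program that Hadoop runs. It
does not use any Hadoop API. The log format, the sample lines, and the names
map_line, reduce_counts, and run_job are all made up for illustration; on a
real cluster the framework would spread these same steps across many machines
and rerun any piece that failed.

```python
from collections import defaultdict

# Hypothetical log format, assumed for this sketch: "<user> <action> <url>"
LOG_LINES = [
    "alice click /home",
    "bob search /find",
    "alice click /news",
    "carol click /home",
]


def map_line(line):
    """Map step: emit one (action, 1) pair for each log line."""
    user, action, url = line.split()
    yield action, 1


def reduce_counts(action, counts):
    """Reduce step: sum every count that was emitted for one key."""
    return action, sum(counts)


def run_job(lines):
    """Run the map step, group values by key, then run the reduce step.

    This single-process loop stands in for what the framework would do
    across many computers.
    """
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_line(line):
            grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(LOG_LINES))
```

The point of splitting the work into a per-line map step and a per-key
reduce step is that neither step needs to see the whole dataset at once,
which is what lets the work be divided among many ordinary computers.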

-- Owen
