hadoop-common-user mailing list archives

From Anupam Seth <anup...@yahoo-inc.com>
Subject RE: Need help understanding Hadoop Architecture
Date Mon, 24 Oct 2011 16:26:05 GMT
Hi Mike,

This might help address your question:

http://storageconference.org/2010/Papers/MSST/Shvachko.pdf

Regards,
Anupam

-----Original Message-----
From: panamamike [mailto:panamamike@hotmail.com] 
Sent: Sunday, October 23, 2011 9:59 AM
To: core-user@hadoop.apache.org
Subject: Need help understanding Hadoop Architecture


I'm new to Hadoop.  I've read a few articles and presentations which are
directed at explaining what Hadoop is, and how it works.  Currently my
understanding is that Hadoop is an MPP system which uses a large block
size to find data quickly.  In theory, I understand how a large block size,
an MPP architecture, and what I take to be a massive indexing scheme via
MapReduce can be used to find data.

What I don't understand is how, after you identify the appropriate 64MB
block, you find the data you're specifically after.  Does this mean
the CPU has to search the entire 64MB block for the data of interest?  If
so, how does Hadoop know what data from that block to retrieve?

I'm assuming the block is probably composed of one or more files.  If not,
I'm assuming the user isn't looking for the entire 64MB block but rather a
portion of it.

Any help indicating documentation, books, articles on the subject would be
much appreciated.

Regards,

Mike
-- 
View this message in context: http://old.nabble.com/Need-help-understanding-Hadoop-Architecture-tp32705405p32705405.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
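
For readers of the archive, a rough illustration of the question above. As
far as I understand it, a plain MapReduce job keeps no index: an HDFS block
is a chunk of a single file (a file spans one or more blocks, not the other
way round), and each map task reads its whole input split record by record
while the map function decides which records to keep. The sketch below is a
toy model in Python, not Hadoop code. The names read_split and map_task, the
"key,value" record layout, and the 6-byte split size standing in for 64MB
are all made up for illustration. The boundary rule (skip the first partial
line unless the split starts at offset 0, and read past the end to finish
the last line) follows my understanding of how line-oriented text input is
handled.

```python
import io


def read_split(data: bytes, start: int, length: int):
    """Yield the complete lines that belong to the split [start, start+length).

    Toy stand-in for a line-oriented record reader (hypothetical helper).
    """
    stream = io.BytesIO(data)
    end = start + length
    stream.seek(start)
    if start != 0:
        # Discard the first (possibly partial) line; the reader of the
        # previous split reads past its own end to pick it up.
        stream.readline()
    while stream.tell() <= end:
        line = stream.readline()
        if not line:
            break  # end of file
        yield line.rstrip(b"\n")


def map_task(data: bytes, start: int, length: int, wanted: bytes):
    """Scan one split sequentially and keep records whose key matches."""
    return [rec for rec in read_split(data, start, length)
            if rec.split(b",")[0] == wanted]


if __name__ == "__main__":
    file_bytes = b"a,1\nb,2\nc,3\nd,4\n"
    split_size = 6  # stands in for the 64MB block size
    for offset in range(0, len(file_bytes), split_size):
        hits = map_task(file_bytes, offset, split_size, b"c")
        print(offset, hits)
```

So in this model the answer to "does the CPU have to search the entire
block" is yes: every split is scanned in full, and the work is spread
across many map tasks running in parallel rather than avoided by an index.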

