hadoop-common-dev mailing list archives

From Saikat Kanjilal <sxk1...@hotmail.com>
Subject RE: MapReduce Usage in Search Engines
Date Fri, 30 Jul 2010 14:06:07 GMT

Hello Yuhendar, I'll add as much as I can at a high level, from what I have learned so far
about map-reduce, to answer your questions:
1)  The goal behind map-reduce is to perform a distributed computation: break a large,
computation-intensive problem into smaller chunks, solve those chunks independently, and finally
combine the results. The problem in this case is search. You have a master node and a set of
slave nodes. The master (in Hadoop this is the JobTracker; the NameNode is the master for HDFS
storage, not for jobs) takes input from the client in the form of a job and hands pieces of
that job out to the slaves, which go off and solve smaller pieces of the problem (the map step).
The intermediate results from all the slaves are then gathered and combined (the reduce step),
and the final result is presented back to the client. A more concrete example is the distributed
grep problem, which is a form of searching for a particular word (or document) in a huge dataset.
Take a look at the hadoop examples or the hadoop webpage to learn more about this.
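2) Google, to my understanding, uses its own internal implementation of the general mapreduce
algorithm. Their MapReduce paper describes using it to build the index for web search, and
mapreduce jobs can read from and write to their datastore, known as bigtable, which is a sparse,
distributed, multi-dimensional sorted map.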
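
To make the distributed grep example in point 1 a bit more concrete, here is a minimal
single-process sketch in plain Python. It is not Hadoop code and does not use any Hadoop API;
the names map_grep, reduce_grep and run_grep are made up for illustration, and each entry in
"splits" simply stands in for the chunk of input one slave would process.

```python
import re
from collections import defaultdict


def map_grep(doc_id, text, pattern):
    """Map step (hypothetical helper): emit (doc_id, (line_no, line)) per matching line."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        if re.search(pattern, line):
            yield doc_id, (line_no, line)


def reduce_grep(doc_id, matches):
    """Reduce step (hypothetical helper): collect all matches for one document, in order."""
    return doc_id, sorted(matches)


def run_grep(splits, pattern):
    """Simulate the job: map over every split, group by key, then reduce each group.

    `splits` is a list of dicts mapping doc_id -> text. On a real cluster each
    split would be processed on a different node; here they run in a simple loop.
    """
    grouped = defaultdict(list)
    for split in splits:
        for doc_id, text in split.items():
            for key, value in map_grep(doc_id, text, pattern):
                grouped[key].append(value)
    return dict(reduce_grep(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    splits = [
        {"a.txt": "hadoop rocks\nnothing here"},
        {"b.txt": "no match\nmore hadoop"},
    ]
    for doc_id, matches in run_grep(splits, r"hadoop").items():
        print(doc_id, matches)
```

Documents with no matching lines never emit a key from the map step, so they do not show up in
the result at all. In a real job the grouping in the middle is done by the framework's
shuffle/sort phase rather than by a dictionary in one process.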

My 2 cents. Regards.

> Date: Fri, 30 Jul 2010 11:53:49 +0530
> Subject: Re: MapReduce Usage in Search Engines
> From: yuhendar@tce.edu
> To: common-dev@hadoop.apache.org
> Hi all,
>           I have a basic query regarding Mapreduce usage in search
> engines. My queries are:
> 1. How is Map-Reduce used in search?
> 2. Does Google use the Mapreduce algorithm for its search engine? If so, how do
> they use it? Please explain the architecture or flow of how Google or other
> search engines work, and what part mapreduce plays in it.
> With Regards,
> B.Yuhendar
> -----------------------------------------
> This email was sent using TCEMail Service.
> Thiagarajar College of Engineering
> Madurai-625 015, India