From: Saikat Kanjilal
Reply-To: common-dev@hadoop.apache.org
Subject: RE: MapReduce Usage in Search Engines
Date: Fri, 30 Jul 2010 07:06:07 -0700
In-Reply-To: <3c0c9c3532a41cb6b5a548f7bf23292b.squirrel@mail.tce.edu>
References: <3c0c9c3532a41cb6b5a548f7bf23292b.squirrel@mail.tce.edu>

Hello Yuhendar,

I'll add as much as I can at a high level, from what I have learned so far about map-reduce, to answer your questions:

1) The goal behind map-reduce is to perform a distributed computation that breaks a large, computation-intensive problem into smaller chunks, solves those chunks individually, and finally combines the results. The problem in this case is search. You have a master node and a set of slave nodes. The master (in the Hadoop domain I believe it is known as the JobTracker; the NameNode is the master for HDFS storage rather than for jobs) takes input from the client in the form of a job and forwards this job to the slaves, which go off, solve smaller pieces of the problem, and return the results. The master then uses a combine approach to gather the results from all the slaves and present them back to the client. A more concrete example is the distributed grep problem, which is a form of searching for a particular word (or document) in a huge dataset. Take a look at the Hadoop examples or the Hadoop web page to learn more about this.

2) Google, to my understanding, uses its internal implementation of the general MapReduce algorithm to store data in its datastore, known as Bigtable, which is a multi-dimensional sorted map.

My 2 cents.

Regards.

> Date: Fri, 30 Jul 2010 11:53:49 +0530
> Subject: Re: MapReduce Usage in Search Engines
> From: yuhendar@tce.edu
> To: common-dev@hadoop.apache.org
>
> Hi all,
> I have a basic query regarding Mapreduce usage in search
> engines. My queries are:
>
> 1.How Map-Reduce is used in search?
> 2.Is Google uses Mapreduce algorithm for its search engine? Then how they
> use it? Explain the architecture or flow of how google or other search
> engines work and what is the part of mapreduce in it.....................
>
> Please Explain.........
>
> With Regards,
> B.Yuhendar
>
>
> -----------------------------------------
> This email was sent using TCEMail Service.
> Thiagarajar College of Engineering
> Madurai-625 015, India
>
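
To make the distributed grep example from point 1 a little more concrete, here is a minimal sketch in Python. It is only an in-process simulation of the map, shuffle, and reduce steps, not the Hadoop API and not the code of the Hadoop grep example. The function names (map_grep, shuffle, reduce_count, run_grep) and the sample data are hypothetical, chosen for illustration. Each inner list in chunks stands in for an input split that a cluster would hand to a separate worker.

```python
import re
from collections import defaultdict


def map_grep(chunk, pattern):
    """Map step: emit (matched_text, 1) for every regex match in one chunk."""
    regex = re.compile(pattern)
    for line in chunk:
        for match in regex.finditer(line):
            yield match.group(0), 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_count(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_grep(chunks, pattern):
    """Simulate the whole job in one process (assumption: no real cluster)."""
    emitted = []
    for chunk in chunks:  # on a cluster, each chunk would go to a different worker
        emitted.extend(map_grep(chunk, pattern))
    return dict(reduce_count(key, values) for key, values in shuffle(emitted).items())


# Hypothetical sample data: two chunks standing in for two input splits.
chunks = [
    ["hadoop runs mapreduce jobs", "search engines index pages"],
    ["mapreduce splits work into chunks", "workers return results"],
]
result = run_grep(chunks, r"mapreduce|search")
print(result)
```

The point of the split is that map_grep only ever looks at its own chunk, so the chunks can be processed independently, and only the small (match, count) pairs need to be brought together for the final sum.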