hadoop-common-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: MapReduce Usage in Search Engines
Date Sat, 31 Jul 2010 03:12:55 GMT
MapReduce tends to be used for massive (re)indexing. 
 See http://search-lucene.com/?q=hadoop+mapreduce&fc_project=Solr&fc_project=Lucene
 for how Lucene/Solr people are using MapReduce.

For example, in a recent project we used MapReduce (Hadoop Streaming with JRuby, 
actually) together with Solr (the embedded version, to be more precise) to speed up 
the building of a 20 GB index that used to take a couple of hours.  Now it takes 7 
minutes, because it's parallelized to the Nth degree.
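
To make the idea concrete, here is a rough sketch of the pattern in Python 
(not the JRuby code from that project). It assumes documents are routed to 
shards by a hash of their id and that each reducer builds one shard; the 
shard count, the function names, and the toy inverted index standing in for 
embedded Solr are all made up for illustration. The shuffle is simulated 
locally so the script runs without a cluster.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count; the original post does not give one


def map_doc(doc_id, text):
    """Map step: route a document to a shard by a stable hash of its id."""
    shard = zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS
    yield shard, (doc_id, text)


def reduce_shard(shard, docs):
    """Reduce step: build one shard's inverted index (term -> sorted doc ids).

    A plain dict stands in for the embedded Solr instance mentioned above.
    """
    postings = defaultdict(set)
    for doc_id, text in docs:
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def run_job(corpus):
    """Simulate the shuffle locally: group map output by shard, then reduce."""
    grouped = defaultdict(list)
    for doc_id, text in corpus.items():
        for shard, record in map_doc(doc_id, text):
            grouped[shard].append(record)
    return {shard: reduce_shard(shard, docs) for shard, docs in grouped.items()}


def search(shard_indexes, term):
    """Query every shard and merge the hits."""
    hits = []
    for index in shard_indexes.values():
        hits.extend(index.get(term.lower(), []))
    return sorted(hits)


if __name__ == "__main__":
    corpus = {
        "d1": "Hadoop MapReduce indexing",
        "d2": "Solr indexing",
        "d3": "Lucene search",
    }
    print(search(run_job(corpus), "indexing"))
```

The speedup comes from the reduce side: each shard is built independently, 
so adding reducers (and shards) spreads the indexing work across more machines.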

MapReduce can also be used for various Machine Learning data crunching, such as 
query log analysis, content analysis, NLP, and building better relevance 
models for search.  See http://mahout.apache.org .
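
As a small illustration of the query log case, the sketch below counts how 
often each query occurs, in Python. The log format (timestamp, a tab, then 
the query text) and the function names are assumptions made for the 
example, and the shuffle is again simulated in-process rather than run on 
Hadoop.

```python
from collections import defaultdict


def map_log_line(line):
    """Map step: emit (normalized query, 1) for one tab-separated log line.

    Assumed line format: "<timestamp>\t<query text>".
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) < 2:
        return  # skip malformed lines
    query = " ".join(parts[1].lower().split())  # lowercase, collapse spaces
    if query:
        yield query, 1


def reduce_counts(query, counts):
    """Reduce step: sum the ones emitted for a single query."""
    return query, sum(counts)


def count_queries(lines):
    """Simulate the shuffle locally: group by query, then reduce each group."""
    grouped = defaultdict(list)
    for line in lines:
        for query, one in map_log_line(line):
            grouped[query].append(one)
    return dict(reduce_counts(q, c) for q, c in grouped.items())


if __name__ == "__main__":
    log = [
        "2010-07-30T02:23:49\tHadoop MapReduce",
        "2010-07-30T02:24:10\thadoop   mapreduce",
        "2010-07-30T02:25:00\tsolr",
    ]
    print(count_queries(log))
```

Counts like these are a typical starting input for things such as query 
suggestions or relevance tuning.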

----Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: "yuhendar@tce.edu" <yuhendar@tce.edu>
> To: common-dev@hadoop.apache.org
> Sent: Fri, July 30, 2010 2:23:49 AM
> Subject: Re: MapReduce Usage in Search Engines
> Hi all,
> I have a basic question about MapReduce usage in search
> engines. My questions are:
> 1. How is MapReduce used in search?
> 2. Does Google use the MapReduce algorithm for its search engine? If so, how do
> they use it? Please explain the architecture or flow of how Google or other
> search engines work, and what part MapReduce plays in it.
> With  Regards,
> B.Yuhendar
> Thiagarajar College of  Engineering
> Madurai-625 015, India
