hadoop-common-user mailing list archives

From "Runping Qi" <runp...@yahoo-inc.com>
Subject RE: Global information in MapReduce
Date Mon, 19 Mar 2007 16:32:01 GMT

If the word set is small (< 100 words), it should be OK to stuff them
into the JobConf.
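
For example, here is a minimal sketch of that approach (using the old
org.apache.hadoop.mapred API; the "word.set" key name and the comma
delimiter are just illustrative choices, not standard properties):

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    // Driver side: pack the small word set into the job configuration.
    JobConf conf = new JobConf(MyJob.class);
    conf.set("word.set", "alpha,beta,gamma");

    // Mapper side: recover the words once per task in configure().
    public static class MyMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

      private String[] words;

      public void configure(JobConf job) {
        words = job.get("word.set", "").split(",");
      }

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, IntWritable> output,
                      Reporter reporter) throws IOException {
        // ... use "words" here ...
      }
    }

The driver statements and the mapper class of course live in different
places; the sketch just shows the two halves of the round trip.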



> -----Original Message-----
> From: Ilya Vishnevsky [mailto:Ilya.Vishnevsky@e-legion.com]
> Sent: Monday, March 19, 2007 9:25 AM
> To: hadoop-user@lucene.apache.org
> Subject: RE: Global information in MapReduce
> 
> 
> Thanks, that's a good idea. As I understand it, the file name would be
> passed using the set() or setObject() methods of JobConf. Am I right?
> But what if I try to use JobConf to pass the whole list of words to
> the mapper? Is that possible?
> 
> 
> 
> One way to do that is to store your words in a DFS file.
> In the configure method of your mapper class, you can read the words
> in from the file and use them. You can use JobConf to pass the file
> name to the mapper.
> 
> Runping
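
Concretely, a sketch of that configure() approach might look like this
(the "word.file" key name is illustrative, not a built-in property):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.HashSet;
    import java.util.Set;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;

    public abstract class WordSetMapperBase extends MapReduceBase {
      // Filled once per task, before any calls to map().
      protected Set<String> words = new HashSet<String>();

      public void configure(JobConf job) {
        try {
          // The driver put the DFS path of the word list under
          // "word.file", e.g. conf.set("word.file", "/user/me/words.txt");
          Path path = new Path(job.get("word.file"));
          FileSystem fs = FileSystem.get(job);
          BufferedReader in =
              new BufferedReader(new InputStreamReader(fs.open(path)));
          String line;
          while ((line = in.readLine()) != null) {
            words.add(line.trim());
          }
          in.close();
        } catch (IOException e) {
          throw new RuntimeException("could not load word set", e);
        }
      }
    }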
> 
> 
> > -----Original Message-----
> > From: Ilya Vishnevsky [mailto:Ilya.Vishnevsky@e-legion.com]
> > Sent: Monday, March 19, 2007 8:13 AM
> > To: hadoop-user@lucene.apache.org
> > Subject: Global information in MapReduce
> >
> > Hello! My question is about MapReduce. Is it possible to pass some
> > global information to the map function? For example, I have a set of
> > words and a large set of documents. I want the map function to get
> > each document as its value and emit a (word, frequency) pair for each
> > word in the set, where "frequency" is the frequency of that word in
> > the document. To do this, the map function needs access to the set of
> > words each time it runs. Is it possible to do that?
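
For illustration, assuming the word set was loaded into a Set<String>
field named "words" in configure() (as sketched above), and the usual
mapred imports and Mapper boilerplate around it, such a map function
could look roughly like this:

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
      // Count occurrences of the interesting words in this document.
      Map<String, Integer> counts = new HashMap<String, Integer>();
      StringTokenizer tok = new StringTokenizer(value.toString());
      while (tok.hasMoreTokens()) {
        String w = tok.nextToken();
        if (words.contains(w)) {
          Integer c = counts.get(w);
          counts.put(w, c == null ? 1 : c.intValue() + 1);
        }
      }
      // Emit one (word, frequency) pair per word found in the document.
      for (Map.Entry<String, Integer> e : counts.entrySet()) {
        output.collect(new Text(e.getKey()),
                       new IntWritable(e.getValue().intValue()));
      }
    }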


