hadoop-common-user mailing list archives

From Nick Jones <nick.jo...@amd.com>
Subject Re: Example for using DistributedCache class
Date Wed, 03 Feb 2010 13:57:54 GMT
Hi Udaya,
The following code uses files that are already in the distributed cache
as part of the map to process incoming data.  I apologize for the naming
conventions, but the code had to be stripped down.  I also removed
several variable assignments, etc.

public class MySpecialJob {
   public static class MyMapper extends MapReduceBase
       implements Mapper<LongWritable, MyMapInputValueClass,
                         MyMapOutputKeyClass, BigIntegerWritable> {

     private Path[] dcfiles;
     ...
		
     public void configure(JobConf job) {
       // Load cached files
       dcfiles = new Path[0];
       try {
         dcfiles = DistributedCache.getLocalCacheFiles(job);
       } catch (IOException ioe) {
         System.err.println("Caught exception while getting cached files: "
           + StringUtils.stringifyException(ioe));
       }
     }

     public void map(LongWritable key, MyMapInputValueClass value,
       OutputCollector<MyMapOutputKeyClass, BigIntegerWritable> output,
       Reporter reporter) throws IOException {
       ...
       for (Path dcfile : dcfiles) {
         if(dcfile.getName().equalsIgnoreCase(file_match)) {
           readbuffer = new BufferedReader(
             new FileReader(dcfile.toString()));
           ...
           while((raw_line = readbuffer.readLine()) != null) {
             ...
           }
         }
       }
     }
   }

   public static void main(String[] args) throws Exception {
     JobConf conf = new JobConf(MySpecialJob.class);
     ...	
     DistributedCache.addCacheFile(new URI("/path/to/file1.txt"), conf);
     DistributedCache.addCacheFile(new URI("/path/to/file2.txt"), conf);
     DistributedCache.addCacheFile(new URI("/path/to/file3.txt"), conf);
     ...
   }
}
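
One variation on the mapper above, offered only as a rough sketch: the
code opens the matching cache file inside map(), so the file is read
again for every input record. If the file is small enough to hold in
memory, it can be loaded once (for example from configure()) into a
lookup table. The sketch below uses plain Java so it runs without a
cluster. java.nio.file.Path stands in for Hadoop's Path, and the class
name, the method names, and the tab-separated "key<TAB>value" file
layout are all assumptions made for illustration, not part of the
original job. It also guards against a null array, since
getLocalCacheFiles() may return null when no cache files were registered.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: none of these names come from the original job.
public class CacheLookupSketch {

    // Return the first path whose file name matches 'wanted'
    // (case-insensitive, like the equalsIgnoreCase check in the mapper),
    // or null if the array is null or nothing matches.
    public static Path findByName(Path[] localFiles, String wanted) {
        if (localFiles == null) {
            return null;
        }
        for (Path p : localFiles) {
            Path name = p.getFileName();
            if (name != null && name.toString().equalsIgnoreCase(wanted)) {
                return p;
            }
        }
        return null;
    }

    // Read the file once and keep its contents in memory.
    // Assumed layout: one "key<TAB>value" pair per line; lines that do
    // not contain a tab after at least one character are skipped.
    public static Map<String, String> loadLookup(Path file) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab > 0) {
                    lookup.put(line.substring(0, tab), line.substring(tab + 1));
                }
            }
        } // try-with-resources closes the reader here
        return lookup;
    }

    public static void main(String[] args) throws IOException {
        // Create a small stand-in for a localized cache file.
        Path dir = Files.createTempDirectory("dcache-sketch");
        Path file = dir.resolve("file1.txt");
        Files.write(file, Arrays.asList("a\t1", "b\t2"), StandardCharsets.UTF_8);

        Path match = findByName(new Path[] { file }, "FILE1.TXT");
        Map<String, String> lookup = loadLookup(match);
        System.out.println(lookup.get("b"));
    }
}
```

In a real mapper the equivalent of loadLookup() would be called from
configure(), and map() would only consult the in-memory table. Whether
that is worthwhile depends on the size of the cached files.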

Nick Jones


Udaya Lakshmi wrote:
> Hi,
>    As a newbie to Hadoop, I am not able to figure out how to use the
> DistributedCache class. Can someone give me a small code sample that
> distributes a file to the cluster and then shows how to open and use
> the file in a map or reduce task?
> Thanks,
> Udaya
> 

