hadoop-user mailing list archives

From Arun Natva <arun.na...@gmail.com>
Subject Re: How to share files amongst multiple jobs using Distributed Cache in Hadoop 2.7.2
Date Tue, 07 Jun 2016 13:06:18 GMT
If you use an instance of the Job class, you can add files to the distributed cache like this:
Job job = Job.getInstance(conf);
job.addCacheFile(new Path(filepath).toUri());
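
For the iterative case, one possible approach is to have the first job write its output to HDFS and have each later job register that output as a cache file. Below is a minimal sketch of only the driver-side bookkeeping, using plain Java so it runs without a cluster. The class name CacheUriPlanner, the method cacheUrisFor, the path and the link name "shared" are all hypothetical; in a real driver each returned URI would be passed to job.addCacheFile(uri) before the job is submitted.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class CacheUriPlanner {

    /**
     * Returns the URIs to register as cache files for one iteration.
     * Iteration 0 produces the shared file, so it has nothing to read yet.
     * The "#linkName" fragment is the name under which tasks see the file
     * in their working directory.
     */
    public static List<URI> cacheUrisFor(int iteration, String sharedPath, String linkName) {
        List<URI> uris = new ArrayList<>();
        if (iteration > 0) {
            uris.add(URI.create(sharedPath + "#" + linkName));
        }
        return uris;
    }

    public static void main(String[] args) {
        // Hypothetical location of the first job's output on HDFS.
        String shared = "/user/sid/iter0/part-00000";
        for (int i = 0; i < 3; i++) {
            List<URI> uris = cacheUrisFor(i, shared, "shared");
            // In the real driver: for (URI u : uris) job.addCacheFile(u);
            System.out.println("iteration " + i + " -> " + uris);
        }
    }
}
```

The point of the sketch is that nothing is "written into" the cache by a running job: the file lives on HDFS, and each later job simply lists it as a cache file when it is configured.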


Sent from my iPhone

> On Jun 7, 2016, at 5:17 AM, Siddharth Dawar <siddharthdawar17@gmail.com> wrote:
> 
> Hi,
> 
> I wrote a program which creates Map-Reduce jobs in an iterative fashion as follows:
> 
> 
> while (true) 
> {
> JobConf conf2  = new JobConf(getConf(),graphMining.class);
> 
> conf2.setJobName("sid");
> conf2.setMapperClass(mapperMiner.class);
> conf2.setReducerClass(reducerMiner.class);
> 
> conf2.setInputFormat(SequenceFileInputFormat.class);
> conf2.setOutputFormat(SequenceFileOutputFormat.class);
> conf2.setOutputValueClass(BytesWritable.class);
> 
> conf2.setMapOutputKeyClass(Text.class);
> conf2.setMapOutputValueClass(MapWritable.class);
> conf2.setOutputKeyClass(Text.class);
> 
> conf2.setNumMapTasks(Integer.parseInt(args[3]));
> conf2.setNumReduceTasks(Integer.parseInt(args[4]));
> FileInputFormat.addInputPath(conf2, new Path(input));
> FileOutputFormat.setOutputPath(conf2, new Path(output));
> RunningJob job = JobClient.runJob(conf2);
> }
> 
> Now, I want the first job that gets created to write something to the distributed cache, and the jobs created after it to read from the distributed cache.
> 
> I came to know that the DistributedCache.addCacheFile() method is deprecated, so the documentation suggests using the Job.addCacheFile() method, which is specific to each job.
> 
> But I am unable to get a handle on the currently running job, as JobClient.runJob(conf2) submits the job internally.
> 
> 
> How can I make the content written by the first job in this while loop available, via the distributed cache, to the jobs created in later iterations of the loop?
> 
