hadoop-mapreduce-user mailing list archives

From Marko Dinic <marko.di...@nissatech.com>
Subject Re: Reading a sequence file from distributed cache
Date Tue, 12 May 2015 10:58:34 GMT
Hello,

I have used getCacheFiles() instead of getLocalCacheFiles() and now it 
works.
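
A rough sketch of what probably changed, modelled with plain java.net.URI rather than the Hadoop classes. It rests on an assumption not confirmed in this thread: getLocalCacheFiles() returns scheme-less paths on the task node's local disk, getCacheFiles() returns the URIs originally passed to addCacheFile(), and FileSystem.get(conf) returns the default file system (HDFS outside standalone mode). The host name and both paths are made up for illustration.

```java
import java.net.URI;

public class CachePathDemo {

    // Stand-in for "open this path on that file system": a path with no
    // scheme is interpreted relative to the file system it is handed to,
    // while a fully qualified URI keeps pointing where it already points.
    static URI resolveAgainst(URI fileSystem, String path) {
        return fileSystem.resolve(path);
    }

    public static void main(String[] args) {
        // Hypothetical default file system, as FileSystem.get(conf) might
        // return in pseudo-distributed or distributed mode.
        URI defaultFs = URI.create("hdfs://namenode:8020/");

        // Hypothetical value of the kind getLocalCacheFiles() is assumed to
        // return: a location on the node's local disk, with no scheme.
        String localCopy = "/tmp/mapred/local/medoids.seq";

        // Hypothetical value of the kind getCacheFiles() is assumed to
        // return: the URI that was registered with addCacheFile().
        String cacheUri = "hdfs://namenode:8020/user/marko/medoids.seq";

        // The local path ends up being looked for on HDFS, where it does
        // not exist; that would match the FileNotFoundException.
        System.out.println(resolveAgainst(defaultFs, localCopy));
        // prints hdfs://namenode:8020/tmp/mapred/local/medoids.seq

        // The cache URI already names a file on HDFS, so it resolves to itself.
        System.out.println(resolveAgainst(defaultFs, cacheUri));
        // prints hdfs://namenode:8020/user/marko/medoids.seq
    }
}
```

If that picture is right, the mapper side amounts to building the reader from new Path(DistributedCache.getCacheFiles(conf)[0]), or alternatively keeping getLocalCacheFiles() and opening the file with FileSystem.getLocal(conf). In standalone mode the default file system is the local one, which would explain why the original code worked there.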

Can someone please explain the difference between the two? I haven't been 
able to find a good explanation of how they work.

Thanks,
Marko

On 05/11/2015 11:25 PM, marko.dinic@nissatech.com wrote:
>
> Hello,
>
> I'm new to Hadoop and I'm having a problem reading from a sequence 
> file that I add to the distributed cache.
>
> I didn't have problems when I ran it in standalone mode, but I do now 
> in pseudo-distributed and distributed mode.
>
> I'm adding the file to the distributed cache like this:
>
> |DistributedCache.addCacheFile(new URI(currentMedoids), conf);|
>
> And reading from it in the mapper's setup method:
>
> |         Configuration conf = context.getConfiguration();
>          FileSystem fs = FileSystem.get(conf);
>
>          Path[] paths = DistributedCache.getLocalCacheFiles(conf);
>
>          List<Element> sketch = new ArrayList<Element>();
>
>          SequenceFile.Reader medoidsReader = new SequenceFile.Reader(fs, paths[0], conf);
>
>          Writable medoidKey = (Writable) medoidsReader.getKeyClass().newInstance();
>          Writable medoidValue = (Writable) medoidsReader.getValueClass().newInstance();
>
>          while(medoidsReader.next(medoidKey, medoidValue)){
>
>              ElementWritable medoidWritable = (ElementWritable)medoidValue;
>              sketch.add(medoidWritable.getElement());
>          }|
>
> And I'm getting a FileNotFoundException.
>
> Can anyone please help me and explain what the problem is and 
> how to do this properly?
>
> Thanks
>

