hadoop-common-user mailing list archives

From Abhijit Sarkar <abhijit.sar...@gmail.com>
Subject RE: FileNotFoundException trying to uncompress local cache archive
Date Mon, 12 Aug 2013 03:28:47 GMT
Can someone please advise?

> From: abhijit.sarcar@gmail.com
> To: user@hadoop.apache.org
> Subject: FileNotFoundException trying to uncompress local cache archive
> Date: Sun, 11 Aug 2013 11:43:02 -0400
> 
> Hi,
> As a learning exercise, I'm receiving a simple text file URI as an argument,
> compressing it using GzipCodec and placing it in the Distributed Cache. In the Reducer, I'm
> retrieving the archive, uncompressing it and processing the text file. Well, at least that's
> the idea. My uncompression code is unable to find the local cache archive and throws FileNotFoundException.

> I'm not using any GenericOptionsParser features like -copyFromLocal and am trying to keep
> it all in the code.
> 
> Driver:
> public int run(String[] args) throws Exception {
> Configuration conf = getConf();
> 
> final URI compressedFileURI = compressFile(new Path(args[2]).toUri(), "gzip", conf); // implementation later
> 
> DistributedCache.addCacheArchive(compressedFileURI, conf);
> 
> Reducer:
> final Path[] cacheFiles = DistributedCache.getLocalCacheArchives(conf);
> 
> // some sanity check code
> cacheFileURI = uncompressFile(cacheFiles[0].toUri(), conf); //implementation later
> 
> Utility:
> public static URI compressFile(final URI uncompressedURI,
> 		final String codecName, final Configuration conf)
> 		throws IOException {
> 	final FileSystem fs = FileSystem.get(conf);
> 	final CompressionCodec codec = new GzipCodec();
> 	final Path uncompressedPath = new Path(uncompressedURI);
> 
> 	String archiveName = addExtension(uncompressedPath.getName(),
> 			codec.getDefaultExtension(), true);
> 
> 	final Path archivePath = new Path(uncompressedPath.getParent(),
> 			archiveName);
> 
> 	final OutputStream outputStream = new FileOutputStream(archivePath
> 			.toUri().getPath());
> 	final InputStream inputStream = new FileInputStream(
> 			uncompressedURI.getPath());
> 	final CompressionOutputStream out = codec
> 			.createOutputStream(outputStream);
> 	org.apache.hadoop.io.IOUtils.copyBytes(inputStream, out, conf, false);
> 	// clean up
> 
> public static URI uncompressFile(final URI archiveURI,
> 		final Configuration conf) throws IOException {
> 	final Path archivePath = new Path(archiveURI);
> 
> 	final FileSystem fs = FileSystem.get(conf);
> 
> 	final CompressionCodec codec = new CompressionCodecFactory(conf)
> 			.getCodec(archivePath);
> 	final Path uncompressedPath = new Path(
> 			CompressionCodecFactory.removeSuffix(archiveURI.getPath(),
> 					codec.getDefaultExtension()));
> 	
> 	final OutputStream outputStream = fs.create(uncompressedPath);
> 
> 	// FileNotFoundException is thrown by the next statement
> 	final InputStream inputStream = new FileInputStream(
> 			archiveURI.getPath());
> 
> Regards,
> Abhijit 		 	   		  
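
A note for anyone reading this thread later: java.io.FileInputStream throws FileNotFoundException not only when a path is missing but also when it names a directory. The sketch below uses plain java.io with no Hadoop classes, and the names LocalPathProbe, classify and open are hypothetical. It shows one way to check what a local path such as cacheFiles[0] actually points to before opening it. Whether a directory is the cause in this case is an assumption, not something confirmed in the thread.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Hadoop: reports what a local path points to
// and opens it only when it is a regular file.
final class LocalPathProbe {

    enum Kind { MISSING, DIRECTORY, REGULAR_FILE, OTHER }

    // Classify the path using plain java.io.File checks.
    static Kind classify(File path) {
        if (!path.exists()) return Kind.MISSING;
        if (path.isDirectory()) return Kind.DIRECTORY;
        if (path.isFile()) return Kind.REGULAR_FILE;
        return Kind.OTHER;
    }

    // Open the path, or fail with a message that says why it cannot be read.
    static InputStream open(File path) throws IOException {
        Kind kind = classify(path);
        if (kind == Kind.DIRECTORY) {
            String[] entries = path.list();
            throw new FileNotFoundException(path + " is a directory containing "
                    + (entries == null ? "unreadable entries" : String.join(", ", entries)));
        }
        if (kind != Kind.REGULAR_FILE) {
            throw new FileNotFoundException(path + " is " + kind);
        }
        return new FileInputStream(path);
    }

    // Usage: pass a local path as the first argument (defaults to the current directory).
    public static void main(String[] args) throws IOException {
        File path = new File(args.length > 0 ? args[0] : ".");
        System.out.println(path.getAbsolutePath() + " -> " + classify(path));
    }
}
```

In the Reducer, a call such as LocalPathProbe.classify(new File(cacheFiles[0].toUri().getPath())) could be logged before the call to uncompressFile to see which case applies.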
 		 	   		  