hadoop-common-user mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: HDFS files
Date Wed, 09 Jul 2008 15:26:17 GMT
Unfortunately you have to use FSDataInputStream, not FileInputStream,
to interact with HDFS files. Does this image processing code have a
constructor that accepts an InputStream? If so, just pass it the
FSDataInputStream returned by fs.open().
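
If the image code only needs a channel to read from, a rough sketch of the
idea looks like the following. It assumes the reader can work with a plain
ReadableByteChannel rather than a FileChannel (so no position() or map()),
and StreamImageReader and readAll are made-up names, since the real reader
class isn't shown here. A ByteArrayInputStream stands in for the HDFS stream
so the snippet runs without a cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical stand-in for the image reader; the real class is not shown in this thread.
final class StreamImageReader {

    // Reads every byte from any InputStream by wrapping it in a channel.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteBuffer buf = ByteBuffer.allocate(8192);
        // Channels.newChannel works for any InputStream, not just FileInputStream.
        try (ReadableByteChannel ch = Channels.newChannel(in)) {
            while (ch.read(buf) != -1) {
                buf.flip();                              // switch buffer to draining
                out.write(buf.array(), 0, buf.limit());  // copy the bytes just read
                buf.clear();                             // switch back to filling
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // In the job this would be the stream returned by fs.open(split.getPath());
        // an in-memory stream is used here as a placeholder.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
        byte[] data = readAll(in);
        System.out.println(data.length + " bytes read");
    }
}
```

Since FSDataInputStream is an InputStream, the same readAll call should accept
it unchanged. This only covers sequential reads; code that relies on
FileChannel-specific features would need a different approach.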

-Bryan

On Jul 8, 2008, at 10:48 AM, Kayla Jay wrote:

> Hi
>
> I am using code for a reader that must be given a filename in order
> to create a FileInputStream instance, which it then reads through
> getChannel().  I have to use FileInputStream because it is
> processing image files and it's faster than InputStream.
>
> I can run this code locally, but when I move my job to files
> that are in HDFS, it fails.  I'm assuming that because the file is
> within HDFS, the JVM doesn't recognize it when the filename is
> passed to the FileInputStream.  I get a
> NullPointerException whenever it gets the FileInputStream.
>
> In my custom reader, I have
>
> Reader(config job, filesplit split)
> {
>
> FileSystem fs = split.getPath().getFileSystem(job);
> imagereader = new ImageReader(split.getPath().toString());
>
> ....
>
> }
>
> When I print out split.getPath().toString(), it produces the exact file
> location on the HDFS that I want:
>
> hdfs://myip:port//usr/hadoop/Image1
>
> Is there a way to be able to pass in a filename for FileInputStream  
> of a file that is in HDFS?
>
> I thought that since this is wrapped in a Hadoop job, the JVM would
> take care of it and know where to find that file regardless.
> What's my problem?  Looking at the API for FileInputStream, I have
> no choice but to pass in a string of the filename/path in order to
> successfully create the FileInputStream and process it with its
> getChannel method.
>
> I was trying to find a way to just create an
> FSDataInputStream with fs.open(split.getPath())
> and then pass that in as the input stream. But the API does not allow
> a FileInputStream to be created from an InputStream.
>
>
> Thanks.
>
>
>

