hadoop-common-user mailing list archives

From Sofia Georgiakaki <geosofie_...@yahoo.com>
Subject Re: Is it possible to access the HDFS via Java OUTSIDE the Cluster?
Date Mon, 05 Sep 2011 15:42:44 GMT
Good evening,

This topic seems very interesting.
To be sure I understood the case: do you mean that I can write a simple Java program and
access a file stored in HDFS from within the Java application?

Assuming that I have, e.g., 10 files of 30 GB each stored in HDFS on a cluster of 15 nodes,
how can I run a Java program that accesses these files and reads some blocks from them? Is
it possible to do this without copying the files via -copyToLocal?

If yes, could anyone give some directions on the general form of such Java code,
and on how to run such a program?

Thank you in advance,
Sofia





________________________________
From: Uma Maheswara Rao G 72686 <maheswara@huawei.com>
To: common-user@hadoop.apache.org
Sent: Monday, September 5, 2011 6:04 PM
Subject: Re: Is it possible to access the HDFS via Java OUTSIDE the Cluster?

Hi,

It is very much possible. In fact, that is the main use case for Hadoop :-)

You need to put the hadoop-hdfs*.jar and hadoop-common*.jar files on the class path of the
machine from which you want to run the client program.

On the client node, use the sample code below:

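import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

Configuration conf = new Configuration(); // set any required configuration here
FileSystem fs = new DistributedFileSystem();
fs.initialize(new URI(<Name_Node_URL>), conf); // connect to the NameNode

fs.copyToLocalFile(srcPath, destPath);   // HDFS -> local file system
fs.copyFromLocalFile(srcPath, destPath); // local file system -> HDFS
// ...etc.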
There are many APIs exposed in the FileSystem class, so you can make use of them.
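
On reading only some blocks of a large file without copying it locally: a block is simply a byte range, so a client can open the file with FileSystem.open(Path), which returns an FSDataInputStream, and call seek(long) to jump to the start of the block it wants. The sketch below shows only the offset arithmetic, in plain Java with no Hadoop dependency. The class and method names are hypothetical, and the 64 MB block size is an assumption taken from the original question; the real block size depends on the cluster and file settings.

```java
// Hypothetical helper: offset arithmetic for fixed-size blocks.
// It does not talk to HDFS; it only computes where blocks begin and end.
class HdfsBlockMath {
    // Assumed block size of 64 MB (the figure mentioned in the thread).
    static final long BLOCK_SIZE = 64L * 1024 * 1024;

    /** Number of blocks needed for fileSize bytes; the last block may be partial. */
    static long blockCount(long fileSize, long blockSize) {
        return (fileSize + blockSize - 1) / blockSize;
    }

    /** Byte offset at which the 0-based block blockIndex starts. */
    static long blockStart(long blockIndex, long blockSize) {
        return blockIndex * blockSize;
    }

    /** Length of the given block; only the last block can be shorter than blockSize. */
    static long blockLength(long blockIndex, long fileSize, long blockSize) {
        long start = blockStart(blockIndex, blockSize);
        return Math.min(blockSize, fileSize - start);
    }

    public static void main(String[] args) {
        long fileSize = 30L * 1024 * 1024 * 1024; // one 30 GB file, as in the question
        System.out.println("blocks per file: " + blockCount(fileSize, BLOCK_SIZE));
        System.out.println("block 100 starts at byte " + blockStart(100, BLOCK_SIZE));
    }
}
```

With the HDFS client, the value returned by blockStart would be passed to seek on the opened stream, and blockLength gives how many bytes to read from there.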


Regards,
Uma


----- Original Message -----
From: Ralf Heyde <ralf.heyde@gmx.de>
Date: Monday, September 5, 2011 7:59 pm
Subject: Is it possible to access the HDFS via Java OUTSIDE the Cluster?
To: common-user@hadoop.apache.org

> Hello,
>
> I have found an HDFSClient which shows me how to access my HDFS
> from inside the cluster (i.e. running on a node).
>
> My idea is that different processes may write 64M chunks to HDFS from
> external sources/clients.
>
> Is that possible?
>
> How can that be done? Does anybody have some example code?
>
> Thanks,
>
> Ralf