hadoop-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: cp command in webhdfs (and Filesystem Java Object)
Date Wed, 29 Jun 2016 15:36:01 GMT
Hello Jérôme,

WebHDFS provides an HTTP binding to the FileSystem API, which defines the primitive operations
offered by the file system.  The FileSystem Shell builds on top of the FileSystem API to provide
higher-level workflows, implemented using the FileSystem primitives.  In the case of "cp",
copy is not a primitive operation defined by the FileSystem API.  Instead, the FileSystem
Shell implements it by composing a few different FileSystem API primitives: open, create and
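
To illustrate the idea of building a copy out of primitives, here is a minimal Python sketch that composes the OPEN and CREATE operations of the WebHDFS REST API from the client side. It is only a sketch under stated assumptions, not how the FileSystem Shell itself is implemented: the helper names (op_url, _request, rest_copy) are hypothetical, it assumes plain HTTP on an unsecured cluster that accepts the user.name query parameter, it handles a single file rather than a directory, and it holds the whole file in memory.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def op_url(base_url, path, op, user, **extra):
    """Build the REST URL for one WebHDFS operation on an HDFS path."""
    params = {"op": op, "user.name": user}
    params.update(extra)
    return base_url.rstrip("/") + "/webhdfs/v1" + quote(path) + "?" + urlencode(params)


def _request(method, url, body=None, headers=None):
    """Send one HTTP request without following redirects.

    Returns (status, Location header or None, response body as bytes).
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=30)
    try:
        target = parts.path + ("?" + parts.query if parts.query else "")
        conn.request(method, target, body=body, headers=headers or {})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), resp.read()
    finally:
        conn.close()


def rest_copy(base_url, src, dst, user):
    """Copy one file by composing OPEN (read) and CREATE (write)."""
    # Primitive 1: OPEN. A NameNode usually answers with a 307 redirect to a
    # DataNode; a server that answers directly with 200 is accepted as well.
    status, location, data = _request("GET", op_url(base_url, src, "OPEN", user))
    if status == 307:
        status, _, data = _request("GET", location)
    if status != 200:
        raise RuntimeError("OPEN failed with HTTP status %d" % status)

    # Primitive 2: CREATE, a two-step exchange. The first PUT carries no data
    # and returns the URL that the file content must be sent to.
    status, location, _ = _request(
        "PUT", op_url(base_url, dst, "CREATE", user, overwrite="false"))
    if status != 307 or not location:
        raise RuntimeError("CREATE redirect failed with HTTP status %d" % status)

    # The second PUT uploads the bytes; 201 Created signals success.
    status, _, _ = _request(
        "PUT", location, body=data,
        headers={"Content-Type": "application/octet-stream"})
    if status != 201:
        raise RuntimeError("CREATE upload failed with HTTP status %d" % status)
```

Calling rest_copy("http://localhost:9870", "/hello1", "/hello2", "hadoop") would attempt the copy against a hypothetical local cluster. Note that the data travels through the client in both directions; this is not a server-side copy.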

Due to this separation, you won't find a "cp" operation directly in the WebHDFS REST API (or
HTTPFS).  However, it is possible for the FileSystem Shell to reference paths as URIs using
the "webhdfs" scheme.  For example:

> hadoop fs -cp webhdfs://localhost:9870/hello1 webhdfs://localhost:9870/hello2

> hadoop fs -cat webhdfs://localhost:9870/hello2
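
For readers who want to see how such a URI relates to the REST endpoint, the following Python sketch maps a "webhdfs" (or "swebhdfs") URI to the corresponding HTTP URL for a given operation. The function name is hypothetical and the host, port and paths are just the ones from the commands above; it is a small illustration of the URL convention, not part of any Hadoop client library.

```python
from urllib.parse import quote, urlsplit

# URI scheme used by the FileSystem Shell -> scheme of the REST endpoint.
_SCHEMES = {"webhdfs": "http", "swebhdfs": "https"}


def webhdfs_uri_to_rest_url(uri, op):
    """Translate webhdfs://host:port/path into the REST URL for operation op."""
    parts = urlsplit(uri)
    if parts.scheme not in _SCHEMES:
        raise ValueError("not a webhdfs URI: %r" % uri)
    return "%s://%s/webhdfs/v1%s?op=%s" % (
        _SCHEMES[parts.scheme], parts.netloc, quote(parts.path), op)
```

With the first path above, webhdfs_uri_to_rest_url("webhdfs://localhost:9870/hello1", "OPEN") yields the URL that a plain HTTP client would request to read the file.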

--Chris Nauroth

From: Jérôme BAROTIN <jerome@barotin.fr>
Date: Wednesday, June 29, 2016 at 12:44 AM
To: Rohan Rajeevan <rohan.rajeevan@gmail.com>
Cc: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: cp command in webhdfs (and Filesystem Java Object)

I don't think that is the same thing:
- CREATE is for a local file; in my case, I just want to copy one HDFS path to another on
the same cluster.
- DistCp is for copying files between two different clusters.

I'm using the HTTPFS/WebHDFS REST API to access my cluster, and I need to execute a "cp" command.
How can I do that?

Do I need to develop this service myself?


2016-06-29 8:17 GMT+02:00 Rohan Rajeevan <rohan.rajeevan@gmail.com>:

Maybe look at this? https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE
If you are interested in intra-cluster copy, maybe look at DistCp <https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html>?

On Tue, Jun 28, 2016 at 9:36 AM, Jérôme BAROTIN <jerome@barotin.fr> wrote:

I'm writing this email because I spent an hour looking for a cp command in the WebHDFS
API (in fact, I'm using HTTPFS, but I think it's the same).

This command is implemented in the "hdfs dfs" command-line client (and I'm using this command),
but I can't find it in the WebHDFS REST API. I thought that WebHDFS was an implementation
of the FileSystem object (https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileSystem.html).
I checked the Java API and I haven't found any cp command. The only Java cp command is
on the FileUtil object (https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileUtil.html),
and I'm not sure that it works identically to the "hdfs dfs -cp" command.

I also checked the Hadoop JIRA and found nothing: https://issues.apache.org/jira/browse/HADOOP-9417?jql=project%20%3D%20HADOOP%20AND%20(text%20~%20%22webhdfs%20copy%22%20OR%20text%20~%20%22webhdfs%20cp%22)

Is there a way to execute a cp command through a REST API?

All my best,

