cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: hadoop distcp from brisk cluster to hadoop cluster
Date Sat, 11 Feb 2012 13:59:49 GMT
It mostly works as normal with one caveat.
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/possibly_the_worlds_first_briskcp

In the other direction, hadoop may not know how to "talk" to cfs:/// without
extra software installed. This is where hftp:// comes in...

Copying between versions of HDFS

For copying between two different versions of Hadoop, one will usually use
HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on
the destination cluster (more specifically, on TaskTrackers that can write
to the destination cluster). Each source is specified as
hftp://<dfs.http.address>/<path> (the default dfs.http.address is
<namenode>:50070).
Also, distcp can push or pull data, so you usually have a few options.
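
As a rough sketch of the pull case quoted above, the small Python helper
below builds the hftp:// source URI and the distcp argument list you would
run on the destination cluster. The hostnames, paths, and the destination
namenode port 8020 are assumptions for illustration only; 50070 is the
default dfs.http.address port mentioned in the docs.

```python
def hftp_uri(http_address, path):
    """Build hftp://<dfs.http.address>/<path> for a read-only source."""
    return "hftp://" + http_address + "/" + path.lstrip("/")


def distcp_command(source_http_address, source_path, dest_uri):
    """Return the argv for a pull-style distcp, run on the destination cluster."""
    return ["hadoop", "distcp",
            hftp_uri(source_http_address, source_path),
            dest_uri]


if __name__ == "__main__":
    # Hypothetical hosts and paths; replace with your own clusters.
    cmd = distcp_command(
        "source-nn.example.com:50070",                   # source dfs.http.address
        "/user/rk/data",                                 # path on the source cluster
        "hdfs://dest-nn.example.com:8020/user/rk/data",  # assumed destination URI
    )
    print(" ".join(cmd))
```

The printed line is the command to run from a node on the destination
cluster, since hftp is read-only and the writes have to happen there.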

On Fri, Feb 10, 2012 at 2:56 PM, rk vishu <talk2hadoop@gmail.com> wrote:

> Could any one tell me how can we copy data from Cassandra-Brisk cluster to
> Hadoop-HDFS cluster?
>
> 1) Is there a way to do hadoop distcp between clusters?
> 2) If hive table is created on Brisk cluster, will it similar like HDFS
> file format? can we run map reduce on the other cluster to transform hive
> data (on brisk)?
>
> Thanks and Regards
> RK
