hadoop-hdfs-user mailing list archives

From Uma Maheswara Rao G 72686 <mahesw...@huawei.com>
Subject Re: About block name and location.
Date Tue, 18 Oct 2011 02:51:21 GMT
----- Original Message -----
From: Yuduo Zhou <yuduozhou@gmail.com>
Date: Tuesday, October 18, 2011 6:30 am
Subject: About block name and location.
To: hdfs-user@hadoop.apache.org

> Hi all,
> I'm a rookie to HDFS. Here is just a quick question, suppose I have 
> a big file stored in HDFS, is there any way to generate a file 
> containing all information about blocks belong to this file? 
> For example list of records with format of "block_id, length, 
> offset, hosts[], local/path/to/this/block"?
FileSystem#getFileStatus(Path f) will give some of this information. FileStatus exposes the following fields:

Path path;
long length;
boolean isdir;
short block_replication;
long blocksize;
long modification_time;
long access_time;
FsPermission permission;
String owner;
String group;
Path symlink;

And to get the block locations and offsets you can use FileSystem#getFileBlockLocations.

If you want the output exactly in your format, I would suggest writing a small wrapper in your app that formats the results of the above APIs.
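A rough sketch of such a wrapper's formatting step (the length, offset, and hosts would in practice come from the BlockLocation objects returned by FileSystem#getFileBlockLocations(status, 0, status.getLen()); the block id is not exposed through that public API, so a block index is used here instead, and the sample values below are hypothetical):

```java
import java.util.Arrays;

public class BlockRecordFormatter {
    // Builds one record in the "block_id, length, offset, hosts[]" format
    // asked about above. In a real wrapper, offset/length/hosts would come
    // from BlockLocation.getOffset(), getLength(), and getHosts().
    static String formatRecord(int blockIndex, long offset, long length, String[] hosts) {
        return blockIndex + ", " + length + ", " + offset + ", " + Arrays.toString(hosts);
    }

    public static void main(String[] args) {
        // Hypothetical values for a 128 MB file with a 64 MB block size.
        System.out.println(formatRecord(0, 0L, 67108864L, new String[]{"datanode1", "datanode2"}));
        System.out.println(formatRecord(1, 67108864L, 67108864L, new String[]{"datanode2", "datanode3"}));
    }
}
```

The local path portion of the desired format has no public API equivalent, which is addressed below.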

> The purpose is to enable programs to only access blocks on the same 
> node, to utilize block locality.
Hadoop already supports this; MapReduce uses block locations to schedule tasks on the nodes that hold the data.
> I can retrieve most information using getFileBlockLocations() but I 
> didn't find how to gather information about the local path.
AFAIK, local files are written as just normal files, so Hadoop will not split them into blocks; it does that only in the DFS case.
> Thanks,
> Yuduo
