hadoop-hdfs-user mailing list archives

From Yuduo <yuduoz...@gmail.com>
Subject Re: About block name and location.
Date Tue, 18 Oct 2011 03:05:11 GMT
Thanks, Uma! I'll try to figure it out according to your direction.

On 10/17/2011 10:51 PM, Uma Maheswara Rao G 72686 wrote:
> ----- Original Message -----
> From: Yuduo Zhou<yuduozhou@gmail.com>
> Date: Tuesday, October 18, 2011 6:30 am
> Subject: About block name and location.
> To: hdfs-user@hadoop.apache.org
>> Hi all,
>> I'm a rookie to HDFS. Here is just a quick question, suppose I have
>> a big file stored in HDFS, is there any way to generate a file
>> containing all information about blocks belong to this file?
>> For example list of records with format of "block_id, length,
>> offset, hosts[], local/path/to/this/block"?
> FileSystem#getFileStatus(Path f) will give you some of this information. FileStatus exposes
> the following fields:
> Path path;
> long length;
> boolean isdir;
> short block_replication;
> long blocksize;
> long modification_time;
> long access_time;
> FsPermission permission;
> String owner;
> String group;
> Path symlink;
> And to get the block locations and offsets you can use FileSystem#getFileBlockLocations.
> If you want the output exactly in your format, I would suggest writing a small wrapper in
> your app that formats the results of the above APIs.
>> The purpose is to enable programs to only access blocks on the same
>> node, to utilize block locality.
> Hadoop already supports it.
>> I can retrieve most information using getFileBlockLocations() but I
>> didn't find how to gather information about the local path.
> AFAIK, local files are written as ordinary files, so Hadoop will not split them into
> blocks. It does that only in the DFS case.
>> Thanks,
>> Yuduo
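
A minimal sketch of the wrapper Uma suggests, using FileSystem#getFileStatus and
FileSystem#getFileBlockLocations (the class name BlockInfoDump is illustrative; it assumes
a standard client Configuration that can reach the cluster):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockInfoDump {
    public static void main(String[] args) throws Exception {
        Path file = new Path(args[0]);
        FileSystem fs = FileSystem.get(new Configuration());

        // Basic per-file metadata (length, replication, block size, ...).
        FileStatus status = fs.getFileStatus(file);

        // Ask the NameNode for the location of every block in the file:
        // the range [0, length) covers the whole file.
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());

        // One record per block: offset, length, and the datanodes
        // holding a replica.
        for (BlockLocation b : blocks) {
            System.out.printf("offset=%d length=%d hosts=%s%n",
                b.getOffset(), b.getLength(),
                String.join(",", b.getHosts()));
        }
    }
}
```

Note that BlockLocation reports offsets, lengths, and hosts, but the public FileSystem
API does not expose block IDs or the datanode-local path of a block file, which matches
the gap Yuduo ran into above.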
