hadoop-hdfs-issues mailing list archives

From "feng xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id
Date Fri, 05 Jun 2015 22:24:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575343#comment-14575343 ]

feng xu commented on HDFS-8246:
-------------------------------

At the very least, this feature can help security software running on the local file
system trace I/O back to the HDFS namespace, understand the context better, and take
more accurate action, which is very useful.

On Windows, how about the volume management control code FSCTL_LOOKUP_STREAM_FROM_CLUSTER
and the command “fsutil volume querycluster”? On Unix/Linux it is file system specific;
I think some file systems have an fsdb tool, for example “xfs_db blockuse” on XFS?


> Get HDFS file name based on block pool id and block id
> ------------------------------------------------------
>
>                 Key: HDFS-8246
>                 URL: https://issues.apache.org/jira/browse/HDFS-8246
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: HDFS, hdfs-client, namenode
>            Reporter: feng xu
>            Assignee: feng xu
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-8246.0.patch
>
>
> This feature provides an HDFS shell command and C/Java APIs to retrieve an HDFS file
> name based on a block pool id and a block id.
> 1. The Java API in class DistributedFileSystem
> public String getFileName(String poolId, long blockId) throws IOException
> 2. The C API in hdfs.c
> char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
> 3. The HDFS shell command 
>  hdfs dfs [generic options] -fn <poolId> <blockId>
> This feature is useful if you have an HDFS block file name in the local file system and
> want to find the corresponding HDFS file name in the HDFS namespace
> (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
> Each HDFS block file name in the local file system contains both the block pool id and
> the block id. For example, for the block file
> /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
> the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id is 1073741825.
> The block pool id uniquely identifies an HDFS name node/namespace, and the block id
> uniquely identifies an HDFS file within that name node/namespace, so the combination of
> a block pool id and a block id uniquely identifies an HDFS file name.
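
To make the path layout quoted above concrete, here is a minimal Java sketch that splits a
local block file path into its block pool id and block id. BlockFilePath and its regular
expression are hypothetical and are not part of the HDFS-8246 patch; the sketch assumes
the layout from the example path (a "BP-..." directory and a final "blk_<decimal id>"
component) and nothing more.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of HDFS or the HDFS-8246 patch.
final class BlockFilePath {
    // Assumed layout, taken from the example path in the description:
    // .../<BP-...>/<any directories>/blk_<decimal block id>
    private static final Pattern LAYOUT =
            Pattern.compile("/(BP-[^/]+)/(?:[^/]+/)*blk_(\\d+)$");

    final String poolId;
    final long blockId;

    private BlockFilePath(String poolId, long blockId) {
        this.poolId = poolId;
        this.blockId = blockId;
    }

    /** Extracts the block pool id and block id, or throws if the path does not fit the layout. */
    static BlockFilePath parse(String localPath) {
        Matcher m = LAYOUT.matcher(localPath);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a block file path: " + localPath);
        }
        return new BlockFilePath(m.group(1), Long.parseLong(m.group(2)));
    }

    public static void main(String[] args) {
        BlockFilePath p = parse("/hdfs/1/hadoop/hdfs/data/current/"
                + "BP-97622798-10.3.11.84-1428081035160/current/finalized/"
                + "subdir0/subdir0/blk_1073741825");
        // The two values are what the proposed getFileName(poolId, blockId) takes as input.
        System.out.println(p.poolId + " " + p.blockId);
    }
}
```

The pattern deliberately rejects anything after the digits, so companion files such as
checksum metadata would not be mistaken for block files under this assumption.
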
> The shell command and C/Java APIs do not map the block pool id to a name node, so it is
> the user's responsibility to talk to the correct name node in a federated environment
> that has multiple name nodes. The name node uses the block pool id to check that the
> user is talking to the correct name node.
> The implementation is straightforward. The client request to get the HDFS file name
> reaches the new method String getFileName(String poolId, long blockId) in FSNamesystem
> in the name node through RPC, and the new method does the following:
> (1)	Validate the block pool id.
> (2)	Create Block  based on the block id.
> (3)	Get BlockInfoContiguous from Block.
> (4)	Get BlockCollection from BlockInfoContiguous.
> (5)	Get file name from BlockCollection.
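
The five steps above can be sketched with a toy in-memory model. This is only an
illustration of the lookup chain: ToyNamesystem, ToyBlockInfo and ToyFile are hypothetical
stand-ins for FSNamesystem, BlockInfoContiguous and BlockCollection, not the real HDFS
classes, and the error handling (throwing IOException on a pool id mismatch or an unknown
block) is an assumption, since the description does not say how the patch reports those
cases. The file name used in main is made up.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model only: imitates the roles named in the description, not the real HDFS classes.
final class ToyNamesystem {
    /** Plays the role of BlockCollection: the file that owns a set of blocks. */
    static final class ToyFile {
        final String name;
        ToyFile(String name) { this.name = name; }
    }

    /** Plays the role of BlockInfoContiguous: a block plus a link to its owning file. */
    static final class ToyBlockInfo {
        final long blockId;
        final ToyFile owner;
        ToyBlockInfo(long blockId, ToyFile owner) {
            this.blockId = blockId;
            this.owner = owner;
        }
    }

    private final String blockPoolId;
    private final Map<Long, ToyBlockInfo> blocksMap = new HashMap<>();

    ToyNamesystem(String blockPoolId) { this.blockPoolId = blockPoolId; }

    void addBlock(long blockId, String fileName) {
        blocksMap.put(blockId, new ToyBlockInfo(blockId, new ToyFile(fileName)));
    }

    String getFileName(String poolId, long blockId) throws IOException {
        // (1) Validate the block pool id (assumed behaviour: reject a mismatch).
        if (!blockPoolId.equals(poolId)) {
            throw new IOException("Block pool " + poolId + " is not served by this name node");
        }
        // (2) + (3) Use the block id as the lookup key and fetch the stored block info.
        ToyBlockInfo info = blocksMap.get(blockId);
        if (info == null) {
            throw new IOException("Unknown block id " + blockId); // assumed behaviour
        }
        // (4) Follow the block info to its owning file, then (5) return that file's name.
        return info.owner.name;
    }

    public static void main(String[] args) throws IOException {
        String pool = "BP-97622798-10.3.11.84-1428081035160";
        ToyNamesystem ns = new ToyNamesystem(pool);
        ns.addBlock(1073741825L, "/user/example/data.txt"); // hypothetical file name
        System.out.println(ns.getFileName(pool, 1073741825L));
    }
}
```

Checking the pool id first mirrors the point made in the description: in a federated
cluster a block id is only meaningful to the name node that owns that block pool.
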



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
