hadoop-hdfs-issues mailing list archives

From "Weiwei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12459) Fix revert: Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
Date Tue, 07 Nov 2017 02:55:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241400#comment-16241400 ]

Weiwei Yang commented on HDFS-12459:
------------------------------------

Hi [~shahrs87]

Thanks for taking the time to review this.

bq. We don't use GETFILEBLOCKLOCATIONS in WebHdfsFileSystem. Instead we use GET_BLOCK_LOCATIONS
to fetch {{WebHdfsFileSystem#getFileBlockLocations}}.

The purpose of HDFS-11156 was to make the WebHDFS GETFILEBLOCKLOCATIONS API consistent with
the file system specification. GET_BLOCK_LOCATIONS was intentionally kept for private/internal
use, so we don't expose it to users. WebHDFS.md is the user-facing document for WebHDFS, and
from that point of view WebHDFS supports the GETFILEBLOCKLOCATIONS op for querying block
locations. However, what you said is true: {{WebHdfsFileSystem}} calls the internal op
GET_BLOCK_LOCATIONS in its implementation. That implementation detail is hidden from the user,
who still gets {{BlockLocation[]}} via the {{WebHdfsFileSystem#getFileBlockLocations}} call.
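
For illustration only, a minimal sketch of how a REST client might address the public op. It assumes the standard {{/webhdfs/v1}} path prefix and the {{op}} query parameter; the helper class, host name, port and file path are hypothetical placeholders and not part of the patch.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative helper, not part of Hadoop. */
final class WebHdfsBlockLocationsUrl {
    /** Public op name discussed in this thread. */
    static final String OP = "GETFILEBLOCKLOCATIONS";

    /** Builds the request URI for an absolute HDFS path. */
    static URI build(String host, int port, String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + absolutePath);
        }
        try {
            // The multi-argument constructor percent-encodes illegal path characters.
            return new URI("http", null, host, port,
                    "/webhdfs/v1" + absolutePath, "op=" + OP, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Host and port are placeholders for a NameNode HTTP address.
        System.out.println(build("namenode.example.com", 9870, "/user/alice/data.txt"));
    }
}
```

A user of the REST API only ever sees this op name; which op {{WebHdfsFileSystem}} sends internally is a separate question.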

The patch in HDFS-11156 modified {{WebHdfsFileSystem#getFileBlockLocations}} to retrieve
block locations via the HTTP op GETFILEBLOCKLOCATIONS and to fall back to GET_BLOCK_LOCATIONS
when talking to an older server. Per your comment in HDFS-11156, I have removed those
changes. I am OK with adding them back if you think that is better. Please let me know.
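
To make the fallback idea concrete, here is a rough sketch of the control flow, not the actual HDFS-11156 code. The {{Transport}} interface and {{UnsupportedOpException}} are hypothetical stand-ins for the HTTP call and for however an older server signals that it does not know the new op.

```java
import java.util.Collections;
import java.util.List;

/** Illustrative sketch of "try the new op, fall back to the old one". */
final class BlockLocationFallback {
    /** Hypothetical signal that the server does not recognize an op. */
    static final class UnsupportedOpException extends RuntimeException {
        UnsupportedOpException(String op) {
            super("Unsupported op: " + op);
        }
    }

    /** Hypothetical transport: sends one op and returns the raw locations. */
    interface Transport {
        List<String> call(String op);
    }

    static List<String> getFileBlockLocations(Transport transport) {
        try {
            // Preferred: the public op documented in WebHDFS.md.
            return transport.call("GETFILEBLOCKLOCATIONS");
        } catch (UnsupportedOpException e) {
            // Older server: retry with the internal op.
            return transport.call("GET_BLOCK_LOCATIONS");
        }
    }

    public static void main(String[] args) {
        // Simulated older server that only understands the internal op.
        Transport oldServer = op -> {
            if (op.equals("GETFILEBLOCKLOCATIONS")) {
                throw new UnsupportedOpException(op);
            }
            return Collections.singletonList("datanode-1");
        };
        System.out.println(getFileBlockLocations(oldServer));
    }
}
```

The cost of this approach is one extra round trip against older servers, which is part of what needs weighing when deciding whether to add the change back.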

bq. I understand we have TreeMap all over that util class. But we shouldn't follow that bad
practice. We can change it to HashMap.

Changed to HashMap.

bq. Lets write the test case ...

Addressed.

bq. I don't see any value in logging the request and response.

Removed both.

Please let me know what you think, especially on comment 1. Thanks a lot.

> Fix revert: Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> ----------------------------------------------------------------
>
>                 Key: HDFS-12459
>                 URL: https://issues.apache.org/jira/browse/HDFS-12459
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>         Attachments: HDFS-12459.001.patch, HDFS-12459.002.patch, HDFS-12459.003.patch
>
>
> HDFS-11156 was reverted because the implementation was not optimal. Based on the suggestion
from [~shahrs87], we should avoid creating a DFS client to get block locations because that
incurs an extra RPC call. Instead we should use {{NamenodeProtocols#getBlockLocations}} and then
convert {{LocatedBlocks}} to {{BlockLocation[]}}.
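
The conversion step described above is a plain in-memory mapping once the located blocks are in hand. The sketch below shows the shape of it under loud assumptions: {{Located}} and {{Location}} are simplified hypothetical stand-ins, not the real {{LocatedBlocks}} and {{BlockLocation}} classes, and they carry only hosts, offset and length.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch using stand-in types, not the HDFS classes. */
final class LocatedBlocksConversionSketch {
    /** Hypothetical stand-in for one located block returned by the NameNode. */
    static final class Located {
        final long startOffset;
        final long size;
        final List<String> datanodes;

        Located(long startOffset, long size, List<String> datanodes) {
            this.startOffset = startOffset;
            this.size = size;
            this.datanodes = datanodes;
        }
    }

    /** Hypothetical stand-in for a block location exposed to callers. */
    static final class Location {
        final String[] hosts;
        final long offset;
        final long length;

        Location(String[] hosts, long offset, long length) {
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Maps each located block to one location, preserving order. */
    static Location[] toLocations(List<Located> blocks) {
        Location[] out = new Location[blocks.size()];
        for (int i = 0; i < out.length; i++) {
            Located b = blocks.get(i);
            out[i] = new Location(b.datanodes.toArray(new String[0]), b.startOffset, b.size);
        }
        return out;
    }

    public static void main(String[] args) {
        Location[] locations = toLocations(Arrays.asList(
                new Located(0L, 128L, Arrays.asList("dn1", "dn2")),
                new Located(128L, 64L, Arrays.asList("dn3"))));
        System.out.println(locations.length + " locations");
    }
}
```

No second client or additional RPC is involved in this step, which is the point of the suggestion.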



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

