hadoop-hdfs-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11156) Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API
Date Tue, 29 Nov 2016 17:24:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15705922#comment-15705922 ]

Mingliang Liu commented on HDFS-11156:
--------------------------------------

They are two different problems.

P1. The mismatch between REST {{getFileBlockLocations}} and the FileSystem API is a public,
well-known feature that was designed to be this way.

The general contract of the REST API, as Weiwei pointed out above, is to support the FileSystem
interface, so I do not believe this is the case. Moreover, mismatching APIs should be well documented,
which is extremely useful to new users. {{GET_BLOCK_LOCATIONS}} is a private, unstable op per
its documentation; see [here|https://github.com/apache/hadoop/blob/d8bab3dcb693b2773ede9a6e4f71ae85ee056f79/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java#L37-L37].
It is not even on the WebHDFS REST API documentation page.

P2. The mismatch was an omission, but we don't want to fix it, for the sake of wire compatibility.

I understand that our users may have been using this op, so it is to some degree considered
public/stable. I agree with the community decision that we should break wire compatibility
only for a good reason. I propose that we create a new op named {{GETFILEBLOCKLOCATIONS}}
for the REST API. The name follows the pattern of other REST ops (no underscores), and it returns
{{BlockLocation[]}}, as other FileSystem implementations do. {{WebHdfsFileSystem}} should not be
affected, as it returns {{BlockLocation[]}} anyway.
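
To make the shape difference concrete, here is a minimal sketch of how a LocatedBlocks-style
result could be flattened into a BlockLocation[]-style array, one entry per block. This is only
an illustration of the idea, not the patch. The nested types and their fields ({{startOffset}},
{{numBytes}}, {{hosts}}) are simplified, hypothetical stand-ins chosen for this example; they are
not the real Hadoop classes and do not reproduce the elided JSON in the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Illustrative sketch only. The nested types are simplified, hypothetical
 * stand-ins, NOT the real Hadoop LocatedBlocks / BlockLocation classes.
 */
final class BlockLocationSketch {

    /** Hypothetical stand-in for one located block (field names are assumptions). */
    static final class LocatedBlockStub {
        final long startOffset;   // offset of the block within the file
        final long numBytes;      // length of the block
        final List<String> hosts; // hosts holding a replica

        LocatedBlockStub(long startOffset, long numBytes, List<String> hosts) {
            this.startOffset = startOffset;
            this.numBytes = numBytes;
            this.hosts = hosts;
        }
    }

    /** Hypothetical stand-in for a FileSystem-style block location. */
    static final class BlockLocationStub {
        final String[] hosts;
        final long offset;
        final long length;

        BlockLocationStub(String[] hosts, long offset, long length) {
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Produces one array entry per located block, preserving block order. */
    static BlockLocationStub[] toBlockLocations(List<LocatedBlockStub> blocks) {
        List<BlockLocationStub> result = new ArrayList<>();
        for (LocatedBlockStub b : blocks) {
            String[] hosts = b.hosts.toArray(new String[0]);
            result.add(new BlockLocationStub(hosts, b.startOffset, b.numBytes));
        }
        return result.toArray(new BlockLocationStub[0]);
    }

    public static void main(String[] args) {
        // Made-up sample data: two blocks with made-up host names.
        List<LocatedBlockStub> blocks = Arrays.asList(
            new LocatedBlockStub(0L, 134217728L, Arrays.asList("dn1", "dn2")),
            new LocatedBlockStub(134217728L, 134217728L, Arrays.asList("dn2", "dn3")));
        for (BlockLocationStub loc : toBlockLocations(blocks)) {
            System.out.println(loc.offset + "," + loc.length + "," + String.join("/", loc.hosts));
        }
    }
}
```

File-level fields such as {{fileLength}} and {{isUnderConstruction}} have no place in the
array form, which is one reason the two outputs cannot simply be swapped on the existing op.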

Thanks,

> Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-11156
>                 URL: https://issues.apache.org/jira/browse/HDFS-11156
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.7.3
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>         Attachments: HDFS-11156.01.patch, HDFS-11156.02.patch, HDFS-11156.03.patch, HDFS-11156.04.patch
>
>
> The following WebHDFS REST API call
> {code}
> http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GET_BLOCK_LOCATIONS&offset=0&length=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
>     "fileLength" : 1073741824,
>     "isLastBlockComplete" : true,
>     "isUnderConstruction" : false,
>     "lastLocatedBlock" : { ... },
>     "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the *FileSystem* API,
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be fixed. Marked as
> Incompatible change, as this will change the output of the GET_BLOCK_LOCATIONS API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
