hadoop-mapreduce-user mailing list archives

From Bram Biesbrouck...@beligum.com>
Subject Found weird issue with HttpFS and WebHdfsFileSystem
Date Thu, 16 Apr 2015 14:58:33 GMT
Hi all,

I'm experiencing something strange while developing against the HttpFS
front-end webapp on Hadoop 2.6.0.

I'm currently digging into WebHdfsFileSystem and HttpFS to understand them
better and to learn how the REST API works. I've set up a local single-node
Hadoop instance, which I can query successfully with e.g.
http://localhost:50070/webhdfs/v1/?op=LISTSTATUS
which returns e.g. this FileStatus object:

{
accessTime: 0,
blockSize: 0,
childrenNum: 0,
fileId: 16386,
group: "supergroup",
length: 0,
modificationTime: 1417964248854,
owner: "hadoop",
pathSuffix: "user",
permission: "755",
replication: 0,
storagePolicy: 0,
type: "DIRECTORY"
}

Now, when I start HttpFS and ask for the same data over its interface (
http://localhost:14000/webhdfs/v1/?op=LISTSTATUS), I get a different reply.
Specifically, the childrenNum and fileId fields are missing (storagePolicy
is gone as well) compared to the first result (same file or directory):

{
pathSuffix: "user",
type: "DIRECTORY",
length: 0,
owner: "hadoop",
group: "supergroup",
permission: "755",
accessTime: 0,
modificationTime: 1417964248854,
blockSize: 0,
replication: 0
}
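
To pin down exactly which fields differ between the two replies, a small
script along these lines could help. It is only a sketch: the two dictionaries
are copied by hand from the replies above, and the helper name missing_fields
is made up for illustration.

```python
# Entry returned by the NameNode endpoint (port 50070), copied from above.
namenode_status = {
    "accessTime": 0,
    "blockSize": 0,
    "childrenNum": 0,
    "fileId": 16386,
    "group": "supergroup",
    "length": 0,
    "modificationTime": 1417964248854,
    "owner": "hadoop",
    "pathSuffix": "user",
    "permission": "755",
    "replication": 0,
    "storagePolicy": 0,
    "type": "DIRECTORY",
}

# Entry returned by HttpFS (port 14000) for the same directory.
httpfs_status = {
    "pathSuffix": "user",
    "type": "DIRECTORY",
    "length": 0,
    "owner": "hadoop",
    "group": "supergroup",
    "permission": "755",
    "accessTime": 0,
    "modificationTime": 1417964248854,
    "blockSize": 0,
    "replication": 0,
}


def missing_fields(reference, other):
    """Return the keys present in `reference` but absent from `other`, sorted."""
    return sorted(set(reference) - set(other))


print(missing_fields(namenode_status, httpfs_status))
# ['childrenNum', 'fileId', 'storagePolicy']
```

The values of the fields that both replies share are identical; only the set
of keys differs.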

Since I need the childrenNum property, I started digging into the code to
see where it gets "lost" and found that WebHdfsFileSystem performs a
makeQualified() step (around line 1287 in WebHdfsFileSystem.java) just
before the list of file statuses is returned. Basically, it converts
HdfsFileStatus objects into FileStatus objects, effectively chopping off
those two properties.
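
A toy model of that narrowing step, to show the effect I mean. This is not the
Hadoop code: the class names WireStatusSketch and GenericStatusSketch and the
function to_generic are hypothetical stand-ins for HdfsFileStatus, FileStatus
and the conversion, and only a handful of fields are modelled.

```python
from dataclasses import dataclass, fields


@dataclass
class WireStatusSketch:
    """Hypothetical stand-in for HdfsFileStatus (the richer, over-the-wire type)."""
    path_suffix: str
    type: str
    length: int
    owner: str
    group: str
    permission: str
    children_num: int
    file_id: int


@dataclass
class GenericStatusSketch:
    """Hypothetical stand-in for FileStatus: no children_num or file_id field."""
    path_suffix: str
    type: str
    length: int
    owner: str
    group: str
    permission: str


def to_generic(wire):
    """Copy only the fields the narrower type declares; everything else is dropped."""
    kept = {f.name: getattr(wire, f.name) for f in fields(GenericStatusSketch)}
    return GenericStatusSketch(**kept)


wire = WireStatusSketch("user", "DIRECTORY", 0, "hadoop", "supergroup", "755", 0, 16386)
generic = to_generic(wire)

print(hasattr(wire, "children_num"))     # True
print(hasattr(generic, "children_num"))  # False
```

Once the object has been narrowed like this, anything that serialises it
afterwards has no way to emit the dropped fields.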

The sources for HdfsFileStatus clearly state that it's an "Interface that
represents the over the wire information for a file.", so I wonder why this
happens, since the HdfsFileStatus contains all the right properties,
according to the docs at
http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#List_a_Directory

It feels like the FileStatus class hasn't been updated to match the
HdfsFileStatus class. However, since they don't share any interfaces or
superclasses, I get the feeling this is intentional; I just can't find or
figure out why.

Can somebody help or shed some light?

thanks,

b.
-- 

 Bram Biesbrouck - 0486/118280 - www.beligum.com -  the republic of
reinvention
