hadoop-hdfs-issues mailing list archives

From "Elek, Marton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
Date Sun, 11 Feb 2018 11:40:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359912#comment-16359912 ]

Elek, Marton commented on HDFS-13108:

PS: The patch can be applied on top of HDFS-12735.

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> -------------------------------------------------------------------
>                 Key: HDFS-13108
>                 URL: https://issues.apache.org/jira/browse/HDFS-13108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Major
>         Attachments: HDFS-13108-HDFS-7240.001.patch
> A. Current state
> 1. The datanode host / volume / bucket are defined in the defaultFS (e.g. o3://datanode:9864/test/bucket1)
> 2. The root of the file system points to the bucket (e.g. 'dfs -ls /' lists all the keys from bucket1)
> It works very well, but there are some limitations.
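As a sketch, the current layout described above would correspond to a core-site.xml along these lines (host, port, volume and bucket names are taken from the example; the exact key is the standard fs.defaultFS):

```xml
<configuration>
  <!-- defaultFS carries the datanode host, the volume and the bucket -->
  <property>
    <name>fs.defaultFS</name>
    <value>o3://datanode:9864/test/bucket1</value>
  </property>
</configuration>
```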
> B. Problem one
> The current code doesn't support fully qualified locations. For example, 'dfs -ls o3://datanode:9864/test/bucket1/dir1'
does not work.
> C. Problem two
> I tried to fix the previous problem, but it's not trivial. The biggest obstacle is the
Path.makeQualified call, which transforms an unqualified URL into a qualified one. It is part
of Path.java, so it's common to all the Hadoop file systems.
> In the current implementation it qualifies a URL by keeping the scheme (e.g. o3://)
and the authority (e.g. datanode:9864) from the defaultFS and appending the relative path to
form the qualified URL. For example:
> makeQualified(defaultUri=o3://datanode:9864/test/bucket1, path=dir1/file) returns
o3://datanode:9864/dir1/file, which is obviously wrong (the correct result would be o3://datanode:9864/test/bucket1/dir1/file).
I tried a workaround using a custom makeQualified in the Ozone code, and it worked
from the command line, but it didn't work with Spark, which uses the Hadoop API and the original makeQualified.
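To make the failure mode concrete, here is a toy Python sketch of the qualification behaviour described above. It is not Hadoop's actual implementation; the function name and logic are illustrative only, but it reproduces the problematic result:

```python
from urllib.parse import urlparse

def make_qualified(default_uri: str, path: str) -> str:
    """Naive qualification: keep the scheme and authority from the
    default FS and resolve a relative path against the ROOT of the
    authority. This mirrors the problem described above: the
    /test/bucket1 part of the defaultFS is silently dropped."""
    if "://" in path:
        return path  # already fully qualified
    u = urlparse(default_uri)
    if not path.startswith("/"):
        path = "/" + path  # resolved against the root, not the bucket
    return f"{u.scheme}://{u.netloc}{path}"

# Demonstrates the bug: the volume/bucket prefix of the defaultFS is lost.
print(make_qualified("o3://datanode:9864/test/bucket1", "dir1/file"))
# o3://datanode:9864/dir1/file
```

Running the sketch prints o3://datanode:9864/dir1/file instead of the expected o3://datanode:9864/test/bucket1/dir1/file, which is exactly the wrong URL quoted above.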
> D. Solution
> We should support makeQualified calls, so we can use any path in the defaultFS.
> I propose to use a simplified scheme such as o3://bucket.volume/
> This is similar to the s3a format, where the pattern is s3a://bucket.region/
> We don't need to set the hostname of the datanode (or of the KSM, in case of service discovery);
instead it would be configurable with additional Hadoop configuration values such as fs.o3.bucket.bucketname.volumename.address=http://datanode:9864
(this is how s3a works today, as far as I know).
> We also need to define restrictions for the volume names (in our case they may no longer
include a dot).
> PS: some Spark output
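A minimal sketch of how the proposed o3://bucket.volume/ authority could be parsed. This is a hypothetical helper, not code from the patch; it splits on the last dot, which is only unambiguous because the proposal also forbids dots in volume names:

```python
from urllib.parse import urlparse

def parse_o3_uri(uri: str):
    """Hypothetical parser for the proposed o3://bucket.volume/ scheme.
    Returns (bucket, volume, path). Splitting the authority on the LAST
    dot assumes volume names may not contain dots, per the proposal."""
    u = urlparse(uri)
    bucket, sep, volume = u.netloc.rpartition(".")
    if not sep:
        raise ValueError("authority must be <bucket>.<volume>: " + uri)
    return bucket, volume, u.path or "/"

print(parse_o3_uri("o3://bucket1.volume1/dir1/file"))
# ('bucket1', 'volume1', '/dir1/file')
```

With such a scheme, makeQualified keeps the full bucket.volume authority, so resolving a relative path no longer loses the bucket and volume information.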
> 2018-02-03 18:43:04 WARN  Client:66 - Neither spark.yarn.jars nor spark.yarn.archive
is set, falling back to uploading libraries under SPARK_HOME.
> 2018-02-03 18:43:05 INFO  Client:54 - Uploading resource file:/tmp/spark-03119be0-9c3d-440c-8e9f-48c692412ab5/__spark_libs__2440448967844904444.zip
-> o3://datanode:9864/user/hadoop/.sparkStaging/application_1517611085375_0001/__spark_libs__2440448967844904444.zip
> My defaultFS was o3://datanode:9864/test/bucket1, but Spark qualified the path of the
home directory, losing the /test/bucket1 prefix.

This message was sent by Atlassian JIRA

