hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
Date Thu, 26 Apr 2018 21:58:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455396#comment-16455396
] 

Hudson commented on HDFS-13108:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14070 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14070/])
HDFS-13108. Ozone: OzoneFileSystem: Simplified url schema for Ozone File (omalley: rev 61651dcf5c08d2fc0fed1b1ad99b863bd21ca66f)
* (edit) hadoop-tools/hadoop-ozone/src/test/java/org/apache/hadoop/fs/ozone/contract/OzoneContract.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/web/interfaces/StorageHandler.java
* (edit) hadoop-tools/hadoop-ozone/src/main/java/org/apache/hadoop/fs/ozone/OzoneFileSystem.java
* (edit) hadoop-tools/hadoop-ozone/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneFileInterfaces.java


> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> -------------------------------------------------------------------
>
>                 Key: HDFS-13108
>                 URL: https://issues.apache.org/jira/browse/HDFS-13108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Major
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-13108-HDFS-7240.001.patch, HDFS-13108-HDFS-7240.002.patch,
HDFS-13108-HDFS-7240.003.patch, HDFS-13108-HDFS-7240.005.patch, HDFS-13108-HDFS-7240.006.patch,
HDFS-13108-HDFS-7240.007.patch
>
>
> A. Current state
>  
> 1. The datanode host / bucket /volume should be defined in the defaultFS (eg.  o3://datanode:9864/test/bucket1)
> 2. The root file system points to the bucket (eg. 'dfs -ls /' lists all the keys from
the bucket1)
> It works very well, but there are some limitations.
> B. Problem one 
> The current code doesn't support fully qualified locations. For example 'dfs -ls o3://datanode:9864/test/bucket1/dir1'
is not working.
> C.) Problem two
> I tried to fix the previous problem, but it's not trivial. The biggest problem is that
there is a Path.makeQualified call which could transform unqualified url to qualified url.
This is part of the Path.java so it's common for all the Hadoop file systems.
> In the current implementations it qualifies an url with keeping the schema (eg. o3://
) and authority (eg: datanode: 9864) from the defaultfs and use the relative path as the end
of the qualified url. For example:
> makeQualfied(defaultUri=o3://datanode:9864/test/bucket1, path=dir1/file) will return
o3://datanode:9864/dir1/file which is obviously wrong (the good would be o3://datanode:9864/TEST/BUCKET1/dir1/file).
I tried to do a workaround with using a custom makeQualified in the Ozone code and it worked
from command line but couldn't work with Spark which use the Hadoop api and the original makeQualified
path.
> D.) Solution
> We should support makeQualified calls, so we can use any path in the defaultFS.
>  
> I propose to use a simplified schema as o3://bucket.volume/ 
> This is similar to the s3a  format where the pattern is s3a://bucket.region/ 
> We don't need to set the hostname of the datanode (or ksm in case of service discovery)
but it would be configurable with additional hadoop configuraion values such as fs.o3.bucket.buckename.volumename.address=http://datanode:9864
(this is how the s3a works today, as I know).
> We also need to define restrictions for the volume names (in our case it should not include
dot any more).
> ps: some spark output
> 2018-02-03 18:43:04 WARN  Client:66 - Neither spark.yarn.jars nor spark.yarn.archive
is set, falling back to uploading libraries under SPARK_HOME.
> 2018-02-03 18:43:05 INFO  Client:54 - Uploading resource file:/tmp/spark-03119be0-9c3d-440c-8e9f-48c692412ab5/__spark_libs__2440448967844904444.zip
-> o3://datanode:9864/user/hadoop/.sparkStaging/application_1517611085375_0001/__spark_libs__2440448967844904444.zip
> My default fs was o3://datanode:9864/test/bucket1, but spark qualified the name of the
home directory.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message