hadoop-hdfs-issues mailing list archives

From "Elek, Marton (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDDS-922) Create isolated classloder to use ozonefs with any older hadoop versions
Date Fri, 01 Feb 2019 13:36:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Elek, Marton updated HDDS-922:
    Attachment: HDDS-922.003.patch

> Create isolated classloder to use ozonefs with any older hadoop versions
> ------------------------------------------------------------------------
>                 Key: HDDS-922
>                 URL: https://issues.apache.org/jira/browse/HDDS-922
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Filesystem
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDDS-922.001.patch, HDDS-922.002.patch, HDDS-922.003.patch
>          Time Spent: 10m
>  Remaining Estimate: 0h
> Currently we create a shaded ozonefs artifact that includes all the class files required
to use ozonefs (the Hadoop-compatible file system for Ozone).
> But the shading process of this artifact is very simple: it includes all the class files
but configures no relocation rules (package name renaming). With this approach ozonefs
can be used from a compatible hadoop version (hadoop 3.1 only, I guess) but can't
be used with any older hadoop version, as it requires the newer version of hadoop-common.
> I tried to configure full shading (with relocation) but it's not a simple task. For
example, a pure (non-relocated) Configuration is required by ozonefs itself, but another,
newer Configuration class is required by the ozone client code, which is a dependency of OzoneFileSystem.
So we need a relocated and a non-relocated class at the same time.
> I tried out a different approach: I moved all of the ozone-specific classes out of
OzoneFileSystem into an adapter class (OzoneClientAdapter). With an older hadoop version,
the adapter class itself can be loaded with an isolated classloader. The isolated classloader
can load all the required classes from a jar file at a specific path. It doesn't require
any package relocation, as the default classloader doesn't load these classes.
> The OzoneFileSystem (with an older hadoop version) can load the adapter with the isolated
classloader, and only a few classes need to be shared between the normal and isolated classloaders
(the interface of the adapter and the types in its method signatures). All of the other ozone
classes and the newer hadoop dependencies will be hidden by the isolated classloader.
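
The isolation idea described above can be illustrated with a minimal sketch. This is not the code from the attached patch: the class name IsolatedClassLoader, the isShared helper, and the prefix-based sharing rule are assumptions made for illustration only. Classes are resolved from the isolated jars, except for names matching an explicit list of shared prefixes, which are delegated to the parent so that both sides see the same Class objects.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/**
 * Hypothetical sketch of an isolated classloader (not the HDDS-922 patch).
 * Shared name prefixes are delegated to the parent; every other class must
 * be found in the isolated jars, so nothing else leaks in from the parent.
 */
class IsolatedClassLoader extends URLClassLoader {

  // Assumption: sharing is decided by name prefix, e.g. "java." or the
  // package that holds the adapter interface and its signature types.
  private final List<String> sharedPrefixes;

  IsolatedClassLoader(URL[] isolatedJars, ClassLoader parent,
      List<String> sharedPrefixes) {
    super(isolatedJars, Objects.requireNonNull(parent, "parent"));
    this.sharedPrefixes = new ArrayList<>(sharedPrefixes);
  }

  /** Returns true if the class should be taken from the parent loader. */
  boolean isShared(String className) {
    for (String prefix : sharedPrefixes) {
      if (className.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve)
      throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      Class<?> loaded = findLoadedClass(name);
      if (loaded == null) {
        if (isShared(name)) {
          // Shared types are loaded by the parent, so casts between the
          // two classloader worlds work for these classes.
          loaded = getParent().loadClass(name);
        } else {
          // Everything else comes from the isolated jars only; there is
          // deliberately no fallback to the parent.
          loaded = findClass(name);
        }
      }
      if (resolve) {
        resolveClass(loaded);
      }
      return loaded;
    }
  }
}
```

With a loader like this, the file system class would load the adapter implementation by name through the isolated loader, instantiate it reflectively, and cast the instance to the shared adapter interface. A real shared list would likely need more than "java." (other JDK packages, for example); the single prefix here only keeps the sketch short.
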
> This patch is more of a proof of concept; I would like to start a discussion about
this approach. I successfully used the generated artifact to use ozonefs from the default spark 2.4
distribution (which includes hadoop 2.7).
> For a final patch I would add a check so that ozonefs is used without any classpath separation
by default (this could be configured or chosen automatically).
> To see spark (+ hadoop 2.7 + kubernetes scheduler) used together with ozone, you can check
this screencast: https://www.youtube.com/watch?v=cpRJcSHIEdM&t=8s
