hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11887) Introduce Intel ISA-L erasure coding library for the native support
Date Wed, 15 Jul 2015 17:52:06 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628449#comment-14628449 ]

Colin Patrick McCabe commented on HADOOP-11887:
-----------------------------------------------

bq. I totally got your concern. I'm using the general name erasurecode instead of isal directly
because I wish the overall work won't couple with ISA-L too tightly. In future other native
libraries like Jerasure or even hardware based ones could also be supported as well without
too much change. I'm thinking that the native APIs defined in erasure_code.h should be general
enough so other native libraries could also be easily mapped to it, thus when building, other
libraries could also be passed to via the mentioned options. Note require.erasurecode is used
to enable it, and if enabled, erasurecode.prefix should be specified to provide the library
place; If not enabled (by default for now), the building should go as normally, and the result
won't contain any erasure code related symbols. The logic is similar to existing codes like
for snappy library.

I think you are confusing two different things: how to configure ISA-L, and supporting multiple
different erasure encoding libraries.

ISA-L configuration includes:
* Where to find it (\-Disal.prefix, \-Disal.lib)
* Whether to bundle it (\-Dbundle.isal)
* Whether to fail the build if it is not found (\-Drequire.isal)

These things should have "ISA-L" in the name since they pertain only to that library, and
not to any other libraries.  Naming them "erasurecode" rather than "isal" will actually make
it hard to support more than one erasure encoding library in the future, since we would only
have one set of configuration knobs for both libraries, whereas we would need at least two.
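
To make the naming concrete, a build that enables ISA-L might be invoked roughly as sketched below. This is an illustration only, not a tested recipe: the {{isal}} option names are the ones proposed in this comment, {{-Drequire.isal}} is assumed to be the fail-the-build switch by analogy with the {{require.erasurecode}} option quoted above and the existing snappy options, and {{/opt/isa-l}} is a hypothetical install location. The script only prints the command so it can be reviewed first.

```shell
# Hypothetical ISA-L install prefix; adjust for the local system.
ISAL_PREFIX="/opt/isa-l"

# ISA-L-specific knobs, named after the library they configure:
#   -Drequire.isal   fail the build if ISA-L is not found (assumed name)
#   -Disal.prefix    where to find ISA-L
#   -Dbundle.isal    copy the library into the distribution
ISAL_ARGS="-Drequire.isal -Disal.prefix=${ISAL_PREFIX} -Dbundle.isal"

# Print the full command; remove the echo to run it from a Hadoop source tree.
echo "mvn package -Pnative -DskipTests ${ISAL_ARGS}"
```

A second library would get its own, separately named set of options alongside these, which is not possible if both have to share a single {{erasurecode.prefix}}.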

If you want to support multiple erasure encoding libraries, you will need some kind of codec
interface.  This is in addition to however you would configure the other libraries, not in
replacement of it.

In any case, I think it would be unwise to try to write a plugin interface until we have support
for at least one other erasure encoding library.  Let's keep the scope of this JIRA focused
just on ISA-L.  If people want to come back later and add more libraries, they certainly can.

bq. You found another place I need to change. Yes I need to add an entry for erasure code
in the tool. The question here is, I'm wondering if it can serve the purpose of the new tool
here, because executing of hadoop checknative may need some configuration or tweak to make
it work, the new tool can directly run just after it's out, so can be used in maven unit tests
cleanly. I understand introducing a new tool just for ONE native test may be too heavy, if
you agree, maybe we could go simple, say, if no native library is available, the native test
program could just exit with a warning message? Do we need more native tests anyway in future?
If so the checking with the new tool may sound more reasonable.

Please don't add a new tool just for this.  Add support to {{hadoop checknative}}.  If "hadoop
checknative... need\[s\] some configuration or tweak to make it work" then the admin should
know that their libraries are not being properly found.  This is important information.

Thanks

> Introduce Intel ISA-L erasure coding library for the native support
> -------------------------------------------------------------------
>
>                 Key: HADOOP-11887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11887
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11887-v1.patch, HADOOP-11887-v2.patch, HADOOP-11887-v3.patch, HADOOP-11887-v4.patch
>
>
> This is to introduce Intel ISA-L erasure coding library for the native support, via dynamic loading mechanism (dynamic module, like *.so in *nix and *.dll on Windows).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
