hbase-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9213) create a unified shim for hadoop 1 and 2 so that there's one build of HBase
Date Thu, 15 Aug 2013 17:36:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741243#comment-13741243 ]

Sergey Shelukhin commented on HBASE-9213:

[~eclark] Can you elaborate? Can't we do it in the respective shims, add a proxy class, ...? Granted,
HBase uses HDFS a lot more than Hive, but it should still be doable.

bq. We need to build tgzs and publish to maven. We need to do it for hadoop1 and hadoop2.
The tgz needs to include all dependencies of which there are quite a few when you are running
on hadoop2. The dependencies are ill-specified in associated poms overly-cautious pulling
in way more than is needed in the name of "just-in-case". hadoop1 and hadoop2 and their dependencies
likely need to be siloed (We might do this in a subdir in a tgz).
Do we really need to ship all the dependencies? The user would point us to their Hadoop anyway,
as you mention below, so there is no need to ship our own Hadoop jars.

bq. Then there is publishing to maven. When we publish to maven we say what we depend on in
the associated pom we publish. The vocabulary available to you when you are doing maven publishing
is limited, cryptic, broken (as best as I can discern), and there is no means of flipping
a switch to say "I am currently dependent on hadoop1 (as opposed to hadoop2)" when downstream
dependencies are doing their dependency pull.
As far as I understand, you depend on both. So when somebody pulls you in for their own build,
they also pull both. Then the shim is pointed at the correct set based on the local build flags.
The shim always sees just one set (from wherever it is pointed in local tests, or from wherever the
user pointed it in production), figures out which one it is, and initializes itself accordingly.
It is not the prettiest thing, but it is reliable (if the shim cannot recognize the version then chances
are you are broken wrt it anyway), and it avoids two builds (and maven arcana :))
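
To make the "figures out which one it is" step concrete, here is a rough sketch of the selection logic only. The class name ShimSelector, the method selectShim, and the shim names "hadoop1"/"hadoop2" are made up for illustration and are not existing HBase or Hive code. It assumes the version string has already been read from whatever Hadoop jars are on the classpath (Hadoop exposes one via org.apache.hadoop.util.VersionInfo.getVersion()); the sketch takes the string as a parameter so it runs without Hadoop present.

```java
import java.util.Optional;

/** Hypothetical sketch: choose a shim name from a Hadoop version string. */
public class ShimSelector {

    /**
     * Maps the major version of a string such as "1.2.1" or "2.0.5-alpha"
     * to an illustrative shim name. Returns empty if the version is not recognized.
     */
    static Optional<String> selectShim(String hadoopVersion) {
        if (hadoopVersion == null || hadoopVersion.isEmpty()) {
            return Optional.empty();
        }
        // Everything before the first dot is treated as the major version.
        String major = hadoopVersion.split("\\.", 2)[0];
        switch (major) {
            case "1": return Optional.of("hadoop1");
            case "2": return Optional.of("hadoop2");
            default:  return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Assumption: a real shim would obtain this from the Hadoop jars on the
        // classpath; here it is passed in on the command line or defaulted.
        String version = args.length > 0 ? args[0] : "2.0.5-alpha";
        // Fail fast on an unrecognized version rather than guessing.
        String shim = selectShim(version).orElseThrow(
            () -> new IllegalStateException("Unrecognized Hadoop version: " + version));
        System.out.println(shim);
    }
}
```

An unrecognized version fails at startup instead of falling back to a default, which matches the point above: if the shim cannot recognize the version, you are probably broken against it anyway.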

bq. On hbase-common, we could likely have a single jar that would work with both hadoop1 and
hadoop2. As Elliott says, we haven't done the work (it could be just a simple hack in the
script over in HBASE-8224). I've not tried it (it didn't occur to me – it is a good idea).
The prefix-tree module could likely drop the hadoop1 and hadoop2 suffix.
The jars have different sizes, so it looks like they are indeed different...

bq. all artifacts will have the -hadoop2 and -hadoop1 appended but you probably won't have
to worry because they will be pulled in for you by maven (we did some tests to ensure the
right dependencies come in). Let us know if it isn't working for you. Thanks.
But you would need to include a particular one still, right?

> create a unified shim for hadoop 1 and 2 so that there's one build of HBase
> ---------------------------------------------------------------------------
>                 Key: HBASE-9213
>                 URL: https://issues.apache.org/jira/browse/HBASE-9213
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: build
>            Reporter: Sergey Shelukhin
>             Fix For: 0.96.0
> This is a brainstorming JIRA. Working with the HBase dependency at this point seems to be
rather painful from what I hear from other folks. We could do the Hive model with a unified
shim, built in such a manner that it can work with either version, where at build time dependencies
for all 2-3 versions are pulled and the appropriate one is used for tests, and when running
HBase you have to point at the Hadoop directory to get the dependencies. I am not very proficient
at maven so not quite certain of the best solution yet.

