hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9213) create a unified shim for hadoop 1 and 2 so that there's one build of HBase
Date Thu, 15 Aug 2013 05:21:48 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740695#comment-13740695 ]

stack commented on HBASE-9213:
------------------------------

[~sershe] A single build would be coolio.  If you have ideas for how to make it work, I am
all ears.

Does the hive shim work well?  It strikes me as a horror to keep up.  Our compat modules that
Elliott made are hard enough and these are just meshing APIs.

We need to build tgzs and publish to maven.  We need to do it for hadoop1 and hadoop2.  The
tgz needs to include all dependencies, of which there are quite a few when you are running
on hadoop2.  The dependencies are ill-specified in the associated poms, which are overly cautious
and pull in way more than is needed in the name of "just-in-case".  hadoop1 and hadoop2 and their
dependencies likely need to be siloed (we might do this in a subdir in the tgz).

Remember also that the hadoop we ship w/ is likely moot anyways as it is unlikely to match
what the user is running; user has to replace the hadoop we ship w/ the hadoop they are running.

That's the tgz.

Then there is publishing to maven.  When we publish to maven we say what we depend on in the
associated pom we publish.  The vocabulary available to you when you are doing maven publishing
is limited, cryptic, broken (as best as I can discern), and there is no means of flipping
a switch to say "I am currently dependent on hadoop1 (as opposed to hadoop2)" when downstream
dependencies are doing their dependency pull.

So, after messing w/ the maven arcana (e.g. classifiers, trying to set properties/profiles
at publish and dependency fulfillment time, etc.), we've ended up w/ our current hokey system
where our build can be set against a target hadoop using maven profiles.  This works fine for
local builds or builds up on the build box for unit tests etc., but it is lacking when it comes
time to publish.

Of note on the maven side: each plugin is written by a different crew w/o enforcement of how
to name the property that refers to a particular attribute, so each plugin can name it as it
will.  When it comes to a corner facility such as classifiers, plugins may implement it or not,
so you get 'interesting' cases: classifiers work for nearly all of the build pipeline but NOT
for the final assembly plugin step, the plugin that makes the tgzs.  Or the release plugin
doesn't even know the name of the pom that it is supposed to be reading; you can pass it on the
command line to mvn and all other plugins are fine w/ that, but you have to tell this plugin
what pom to use via gymnastics.

Publishing, we need to generate two different artifacts and we denote them by adding -hadoop1
and -hadoop2 to our version.  I could not make mvn do this for us so I made the script to do it,
working off the committed poms.
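
To make the version-suffix idea concrete, here is a minimal sketch of the kind of rewrite such
a script might do.  It is not the actual script; the function names are hypothetical, and putting
the suffix ahead of any -SNAPSHOT qualifier is an assumption made for illustration.

```python
SNAPSHOT = "-SNAPSHOT"


def suffixed_version(version: str, hadoop: int) -> str:
    """Add -hadoopN to a version string (hypothetical helper).

    Assumption: the suffix goes before a trailing -SNAPSHOT qualifier so
    that the result is still treated as a snapshot version.
    """
    if hadoop not in (1, 2):
        raise ValueError("hadoop must be 1 or 2")
    suffix = f"-hadoop{hadoop}"
    if version.endswith(SNAPSHOT):
        return version[: -len(SNAPSHOT)] + suffix + SNAPSHOT
    return version + suffix


def rewrite_pom(pom_text: str, version: str, hadoop: int) -> str:
    """Replace each <version>version</version> element with the suffixed one.

    Plain string replacement: only elements whose text is exactly `version`
    change, so unrelated dependency versions are left alone.
    """
    old = f"<version>{version}</version>"
    new = f"<version>{suffixed_version(version, hadoop)}</version>"
    return pom_text.replace(old, new)
```

Running rewrite_pom once per target hadoop over each committed pom would give two parallel
sets of poms, one per suffix, which can then be built and published separately.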

On hbase-common, we could likely have a single jar that would work with both hadoop1 and hadoop2.
 As Elliott says, we haven't done the work (it could be just a simple hack in the script over
in HBASE-8224).  I've not tried it (it didn't occur to me -- it is a good idea).  The prefix-tree
module could likely drop the hadoop1 and hadoop2 suffix.


[~roshan_naik] Yes on 1.  See the published SNAPSHOTS for examples (let us know if the recent ones
do not work for you -- we have heard from others that they do work as dependencies for downstreamers
so we are thinking we are good here until we hear otherwise).  On 2., yes... all artifacts
will have the -hadoop2 and -hadoop1 appended but you probably won't have to worry because
they will be pulled in for you by maven (we did some tests to ensure the right dependencies
come in).  Let us know if it isn't working for you.  Thanks.
                
> create a unified shim for hadoop 1 and 2 so that there's one build of HBase
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-9213
>                 URL: https://issues.apache.org/jira/browse/HBASE-9213
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: build
>            Reporter: Sergey Shelukhin
>             Fix For: 0.96.0
>
>
> This is a brainstorming JIRA. Working with HBase dependency at this point seems to be
> rather painful from what I hear from other folks. We could do the hive model with unified
> shim, built in such manner that it can work with either version, where at build time dependencies
> for all 2-3 versions are pulled and the appropriate one is used for tests, and when running
> HBase you have to point at Hadoop directory to get the dependencies. I am not very proficient
> at maven so not quite certain of the best solution yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
