hbase-dev mailing list archives

From Eric Yang <ey...@hortonworks.com>
Subject Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Date Fri, 11 Nov 2011 19:54:28 GMT


On Nov 11, 2011, at 11:04 AM, Gary Helmling wrote:

>> Some effort was put into restoring and forward-porting features to ensure HBase 0.90.x
>> and Hadoop 0.20.205.0 can work together.  I recommend that each HBase release be certified
>> for one major release of Hadoop to reduce risk.  Perhaps when the public Hadoop APIs are
>> rock solid, it will become feasible to have a version of HBase that works across multiple
>> versions of Hadoop.
> 
> Since 0.20.205.0 is the build default, a lot of the testing will
> naturally take place on this combination.  But there are clearly
> others interested in (and investing a lot of testing effort in)
> running on 0.22 and 0.23, so we can't exclude those as unsupported.
> 
>> 
>> In the proposed HBase structure layout change (HBASE-4337), the packaging process excludes
>> the Hadoop jar file from the package and picks it up from the constructed classpath instead,
>> as part of the effort to ensure Hadoop-related projects can work together in an integrated
>> fashion (file system layout change in HADOOP-6255).
> 
> This is good when the packaging system supports dependencies flexible
> enough to allow different Hadoop versions to satisfy the package
> "Depends:", but I don't think it gets us all the way there.
> 
> We still want to provide tarball distributions that contain a bundled
> Hadoop jar for easy standalone setup and testing.
> 
> Maven dependencies seem to be the other limiting factor.  If I set up a
> Java program that uses the HBase client and declare that dependency, I
> get a transitive dependency on Hadoop (good), but what version?  If
> I'm running Hadoop 0.22, but the published maven artifact for HBase
> depends on 205, can I override that dependency in my POM?  Or do we
> need to publish separate maven artifacts for each Hadoop version, so
> that the dependencies for each possible combination can be met (using
> versioning or the version classifier)?
> 
> I really don't know enough about maven dependency management.  Can we
> specify a version like (0.20.205.0|0.22|0.23)?  Or is there any way
> for Hadoop to do a "Provides:" on a virtual package name that those 3
> can share?

Maven is quite flexible in specifying dependencies.  Both version ranges and the provided scope
can be defined in pom.xml to improve compatibility.  Certification of individual versions of the
dependent components should be expressed in the integration test phase of the HBase pom.xml, so
that some version validation can be done in HBase builds (see the profile sketch further down).
If the provided scope is used, there is no need for a virtual package, i.e.:

<dependencies>
  <!-- Hadoop 0.20.x line: the range means 0.20.205.0 or any later release -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>[0.20.205.0,)</version>
    <scope>provided</scope>
  </dependency>
  <!-- Hadoop 0.22+ line: hadoop-common and hadoop-hdfs replace hadoop-core -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>[0.22.0,)</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>[0.22.0,)</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
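
To answer the earlier question about overriding the dependency in a downstream POM: here is a
minimal sketch of what a client project's pom.xml could look like.  The HBase coordinates and
the 0.22.0 versions below are illustrative, not taken from a published artifact.  The transitive
Hadoop dependency is excluded and the version that matches the running cluster is declared
directly:

<dependencies>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.92.0</version>
    <exclusions>
      <!-- Drop whatever Hadoop artifact HBase pulls in transitively -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- Declare the Hadoop line actually in use on the cluster -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>0.22.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>0.22.0</version>
  </dependency>
</dependencies>

With Maven's nearest-wins resolution the direct declaration alone would normally take precedence,
but the explicit exclusion makes the intent clear when the artifactIds differ between Hadoop
lines (hadoop-core vs. hadoop-common/hadoop-hdfs).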

The packaging proposal is to ensure the produced packages are not tied to a single version
of Hadoop.  That lets QA run smoke tests against different Hadoop versions without having to
make changes to the scripts in the release package.
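
As a sketch of how per-Hadoop-version certification could be expressed in the integration test
phase (this is an illustration of the idea, not the actual HBase pom.xml; the profile ids and
the hadoop.version property name are made up), one profile per certified Hadoop release could
be selected at build time, e.g. mvn verify -Phadoop-0.22:

<profiles>
  <!-- Default: build and test against the Hadoop 0.20.x security line -->
  <profile>
    <id>hadoop-0.20.205</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <hadoop.version>0.20.205.0</hadoop.version>
    </properties>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>${hadoop.version}</version>
        <scope>provided</scope>
      </dependency>
    </dependencies>
  </profile>
  <!-- Alternative: build and test against the split 0.22 artifacts -->
  <profile>
    <id>hadoop-0.22</id>
    <properties>
      <hadoop.version>0.22.0</hadoop.version>
    </properties>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
        <scope>provided</scope>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop.version}</version>
        <scope>provided</scope>
      </dependency>
    </dependencies>
  </profile>
</profiles>

Integration tests bound to the verify phase (for example via the maven-failsafe-plugin) would
then run against whichever Hadoop line the selected profile puts on the classpath.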

regards,
Eric