hbase-dev mailing list archives

From Alejandro Abdelnur <t...@cloudera.com>
Subject Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Date Fri, 11 Nov 2011 22:09:14 GMT
Eric,

Do you mean that the HBase published POM won't have a Hadoop artifact
as a dependency?

If so, the artifact will not be usable by HBASE downstream projects
unless the developer adds his/her version of Hadoop explicitly.
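To make the concern concrete, a downstream project consuming such an HBase artifact would have to declare Hadoop itself, roughly like this (a hypothetical sketch; artifact versions are illustrative only):

```xml
<!-- Downstream pom.xml fragment: if the HBase POM omits Hadoop, every
     consumer must pick a Hadoop artifact and version explicitly. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.92.0</version>
  </dependency>
  <!-- Not pulled in transitively; the developer must add it by hand. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.205.0</version>
  </dependency>
</dependencies>
```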

IMO this is not very kosher.

Is that your idea?

Thanks.

Alejandro

On Fri, Nov 11, 2011 at 1:49 PM, Eric Yang <eyang@hortonworks.com> wrote:
> My recommendation is that there is no Hadoop artifact in HBase, but to construct the
> class path from $PREFIX/share/hadoop.  There should be a primary version of Hadoop that
> is advised by the HBase community as officially supported.  Communities like Bigtop can
> advertise community-certified releases with their patches.
>
> regards,
> Eric
>
> On Nov 11, 2011, at 1:31 PM, Alejandro Abdelnur wrote:
>
>> Yes, but which version of Hadoop does your published hbase artifact
>> depend on? And how do you handle pre-0.23 versus 0.23-onwards there?
>> How will the developers using hbase artifacts deal with this?
>>
>> Thanks.
>>
>> Alejandro
>>
>> On Fri, Nov 11, 2011 at 1:24 PM, Eric Yang <eyang@hortonworks.com> wrote:
>>> This is where separate Maven profiles can be useful in toggling tests with different
>>> dependency trees, for test purposes only.
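As a sketch of that idea (profile ids and version numbers here are illustrative, not from the actual HBase build), one profile per Hadoop line could swap the test-time dependency tree:

```xml
<!-- Hypothetical pom.xml fragment: activate with `mvn test -Phadoop-0.23`.
     Each profile supplies a different Hadoop dependency tree for tests only. -->
<profiles>
  <profile>
    <id>hadoop-0.20</id>
    <activation><activeByDefault>true</activeByDefault></activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>0.20.205.0</version>
        <scope>test</scope>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>hadoop-0.23</id>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>0.23.0</version>
        <scope>test</scope>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```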
>>>
>>> regards,
>>> Eric
>>>
>>> On Nov 11, 2011, at 12:26 PM, Alejandro Abdelnur wrote:
>>>
>>>> Eric,
>>>>
>>>> One problem is that you cannot depend on hadoop-core (for pre 0.23)
>>>> and on hadoop-common/hdfs/mapreduce* (for 0.23 onwards) at the same
>>>> time.
>>>>
>>>> Another problem is that different versions of Hadoop bring in
>>>> different dependencies you want to exclude, thus you have to exclude
>>>> all deps from all potential Hadoop versions you don't want (to
>>>> complicate things more, Jetty changed its group name, so you have to
>>>> exclude it twice).
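For illustration, the double Jetty exclusion looks roughly like this (Jetty moved from the org.mortbay.jetty group to org.eclipse.jetty; the artifact ids below are an assumption for the sketch):

```xml
<!-- Sketch: excluding Jetty from a Hadoop dependency. Because the Jetty
     group id changed between versions, both old and new ids must be named. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>0.20.205.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>jetty</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.eclipse.jetty</groupId>
      <artifactId>jetty-server</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```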
>>>>
>>>> Thanks.
>>>>
>>>> Alejandro
>>>>
>>>> On Fri, Nov 11, 2011 at 11:54 AM, Eric Yang <eyang@hortonworks.com> wrote:
>>>>>
>>>>>
>>>>> On Nov 11, 2011, at 11:04 AM, Gary Helmling wrote:
>>>>>
>>>>>>> Some effort was put into restoring and forward-porting features to ensure
>>>>>>> HBase 0.90.x and Hadoop 0.20.205.0 can work together.  I recommend that one
>>>>>>> HBase release should be certified for one major release of Hadoop to reduce
>>>>>>> risk.  Perhaps when the public Hadoop APIs are rock solid, it will become
>>>>>>> feasible to have a version of HBase that works across multiple versions of
>>>>>>> Hadoop.
>>>>>>
>>>>>> Since 0.20.205.0 is the build default, a lot of the testing will
>>>>>> naturally take place on this combination.  But there are clearly
>>>>>> others interested in (and investing a lot of testing effort in)
>>>>>> running on 0.22 and 0.23, so we can't exclude those as unsupported.
>>>>>>
>>>>>>>
>>>>>>> In the proposed HBase structure layout change (HBASE-4337), the packaging
>>>>>>> process excludes the Hadoop jar file and picks it up from a constructed class
>>>>>>> path, in the effort of ensuring Hadoop-related technologies can work together
>>>>>>> in an integrated fashion (file system layout change in HADOOP-6255).
>>>>>>
>>>>>> This is good when the packaging system supports flexible enough
>>>>>> dependencies to allow different Hadoop versions to satisfy the package
>>>>>> "Depends:", but I don't think it gets us all the way there.
>>>>>>
>>>>>> We still want to provide tarball distributions that contain a bundled
>>>>>> Hadoop jar for easy standalone setup and testing.
>>>>>>
>>>>>> Maven dependencies seem to be the other limiting factor.  If I setup
a
>>>>>> java program that uses the HBase client and declare that dependency,
I
>>>>>> get a transitive dependency on Hadoop (good), but what version?  If
>>>>>> I'm running Hadoop 0.22, but the published maven artifact for HBase
>>>>>> depends on 205, can I override that dependency in my POM?  Or do
we
>>>>>> need to publish separate maven artifacts for each Hadoop version,
so
>>>>>> that the dependencies for each possible combination can be met (using
>>>>>> versioning or the version classifier)?
>>>>>>
>>>>>> I really don't know enough about maven dependency management.  Can
we
>>>>>> specify a version like (0.20.205.0|0.22|0.23)?  Or is there any
way
>>>>>> for Hadoop to do a "Provides:" on a virtual package name that those
3
>>>>>> can share?
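On the override question: Maven's "nearest definition wins" mediation does let a consumer pin the version by declaring Hadoop directly or via dependencyManagement. A hedged sketch (version is illustrative, assuming the HBase POM pulls in hadoop-core transitively):

```xml
<!-- Consumer pom.xml fragment: dependencyManagement forces the version
     used for the transitive hadoop-core dependency, overriding whatever
     version the HBase POM declares. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>0.22.0</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```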
>>>>>
>>>>> Maven is quite flexible in specifying dependencies.  Both version ranges and
>>>>> provided scope can be defined in pom.xml to improve compatibility.  Certification
>>>>> of individual versions of dependent components should be expressed in the
>>>>> integration test phase of the HBase pom.xml, to ensure some version test
>>>>> validations can be done in HBase builds.  If provided scope is expressed, there
>>>>> is no need for a virtual package, i.e.:
>>>>>
>>>>> <dependencies>
>>>>>  <dependency>
>>>>>    <groupId>org.apache.hadoop</groupId>
>>>>>    <artifactId>hadoop-core</artifactId>
>>>>>    <version>[0.20.205.0,)</version>
>>>>>    <scope>provided</scope>
>>>>>  </dependency>
>>>>>  <dependency>
>>>>>    <groupId>org.apache.hadoop</groupId>
>>>>>    <artifactId>hadoop-common</artifactId>
>>>>>    <version>[0.22.0,)</version>
>>>>>    <scope>provided</scope>
>>>>>  </dependency>
>>>>>  <dependency>
>>>>>    <groupId>org.apache.hadoop</groupId>
>>>>>    <artifactId>hadoop-hdfs</artifactId>
>>>>>    <version>[0.22.0,)</version>
>>>>>    <scope>provided</scope>
>>>>>  </dependency>
>>>>> </dependencies>
>>>>>
>>>>> The packaging proposal is to ensure the produced packages are not fixed to a
>>>>> single version of Hadoop.  It is useful for QA to run smoke tests without having
>>>>> to make changes to scripts for the release package.
>>>>>
>>>>> regards,
>>>>> Eric
>>>
>>>
>
>
