incubator-bigtop-dev mailing list archives

From Eric Yang <ey...@hortonworks.com>
Subject Re: Packaging concerns
Date Mon, 22 Aug 2011 17:21:12 GMT

On Aug 22, 2011, at 9:02 AM, Roman Shaposhnik wrote:

> On Sat, Aug 20, 2011 at 11:10 AM, Eric Yang <eyang@hortonworks.com> wrote:
>> Greetings big top developers,
>> 
>> Bigtop has started its own customized packaging build process,
>> using Linux distribution-based packaging tools to fully customize
>> Hadoop stack packages.
> 
> Just to be clear -- part of the Bigtop charter is to NOT patch the
> Apache releases. We are also NOT packaging trunks (or otherwise
> unreleased upstream bits) and for the duration of incubation we're
> going to make sure that our packages are clearly labeled as coming
> from an incubating Apache Bigtop project.
> 
>> In the traditional GPL camp, a meta package is used to build the
>> source and apply patches as part of rpm/deb package construction.
>> The advantage is that you can apply hot fixes to the open source
>> code to customize it to fit Linux distributions.
> 
> I'm not sure I follow. Can you, please, elaborate?

Red Hat produces a kernel rpm package, which is the stock kernel tarball with thousands of
patches applied to backport and sustain features.  I thought you were going to patch Apache
software, certify it, and release it as the Bigtop distribution.  You said Bigtop is not going
to patch Apache releases; hence, Bigtop could easily reuse the project-provided .spec and
control files to stay in sync with the Apache projects.  If the "no patch" statement holds,
then there is no need to create Bigtop's own version of package management for the projects, right?

>> In Apache, software is released as a tarball with an md5 signature.
> 
> Yes. And those are source tarballs. My understanding of Apache
> release process is that everything else (including functional
> binaries) is but a convenience artifact. Some of them go into
> Maven repositories, some of them go into the same source
> release tarballs, but there's no clear process defined for
> releasing anything but a source tar ball.
> 

Apache is flexible in how projects release software, to accommodate platform and language
barriers.  For Java projects, the md5 signatures of the jar files are important.  The jar files
an Apache project publishes to the Maven repository are identical to the ones in its binary
tarball.  This is a best practice to ensure that the bits downloaded through Maven are really
the version that the project released.
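
To make the "same bits" check concrete, here is a minimal sketch, assuming you have one copy
of a jar downloaded from the Maven repository and another extracted from the binary tarball.
The function names md5_of and same_bits are made up for illustration and are not part of any
Apache tooling.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large jars do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bits(path_a, path_b):
    """True when both files have the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)


# Hypothetical usage (paths are placeholders):
#   same_bits("from-maven/foo.jar", "from-tarball/lib/foo.jar")
```

The same digest can also be compared against the published .md5 value for the artifact.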
 
>> Ideally, Apache released rpm/deb packages should be the same
>> bits that are in the release tarball.
> 
> Is there such a thing as Apache released rpm/deb packages
> for any project at all? Does Apache have infrastructure to support
> those types of packaged releases? I'm very curious to find this out,
> simply because it'll help Bigtop a lot if we can piggyback off of that
> sort of infrastructure and existing efforts.
> 
>> The Apache HTTPD project stopped distributing in RPM form after a couple of short releases.
> 
> Ok, so you're saying that Apache is not a correct place to have
> packages released?
> I'm confused now.

Apache httpd is not releasing RPMs due to a shortage of infrastructure and manpower to maintain
a decoupled build process.  The previous attempt used a build process decoupled from the
project, which is the wrong approach.  If servers are donated to host yum/apt repositories,
packaging can easily be automated through Jenkins by building as part of the original project.

>> It seems bigtop has chosen to use traditional Redhat/Debian methodology of producing bits
>> to fit Linux distributions.  It is a novel goal from a packaging purity perspective.
> 
> In fact this very subject has been discussed at our meetup last week
> (I really wished either you or Owen could have participated -- but oh
> well :-(). You're right that Bigtop has to decide whether to take the
> route of producing OS-friendly packages or tarball-friendly packages.
> It seems that the community is currently in favor of the OS-friendly
> ones, but if you have arguments in favor of the other style -- let us
> know. Especially if you have customer feedback that would point in
> that direction.
> 
> That said, the decision, as you can see, is not tied to the packaging
> infrastructure decisions made by various projects. I could very well
> imagine project A meeting their customer requirements with one style
> of packaging and project B pursuing a different style. That kind of
> diversity is fine for individual projects. For Bigtop we have to stick
> with one style consistently.
> 
>> However, you might want to pay close attention to license and potential pitfalls.
> 
> Can you be a bit more clear on what is it that you're referring to as
> "license and potential pitfalls"?

rpmbuild repacks jar files; you need to disable that, otherwise it will repackage the existing
jar files.  The result could be considered software that was linked to GPL software.  Hence,
it could get you into trouble if it is not handled properly.  The same applies to stripping
debugging symbols from .so files.
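
One way to notice that a jar was repacked during packaging is to compare it with the released
jar.  Below is a rough sketch, assuming you have the original jar and the one extracted from
the built package; the classify function and its three labels are invented for illustration.
A repacked jar typically carries the same entries with the same contents, while the archive
bytes themselves differ.

```python
import hashlib
import zipfile


def file_md5(path):
    """Hex MD5 digest of the whole archive file."""
    with open(path, "rb") as fh:
        return hashlib.md5(fh.read()).hexdigest()


def entry_crcs(path):
    """Map each non-directory entry name in the jar to its stored CRC-32."""
    with zipfile.ZipFile(path) as jar:
        return {
            info.filename: info.CRC
            for info in jar.infolist()
            if not info.is_dir()
        }


def classify(original, packaged):
    """Label `packaged` relative to `original`.

    "identical": the archive bytes match.
    "repacked":  the bytes differ but entry names and CRCs match.
    "different": the entries themselves differ.
    """
    if file_md5(original) == file_md5(packaged):
        return "identical"
    if entry_crcs(original) == entry_crcs(packaged):
        return "repacked"
    return "different"
```

Anything other than "identical" means the md5 published with the release no longer matches
the jar inside the package.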

> 
>> It may be more interesting to focus on testing the community produced packages in MHO.
> 
> Where can I get these community produced packages you're referring to? What
> projects are included?

There was work done for Hadoop 0.20.204 and 0.23 (HADOOP-6255).  However, the trunk version
was removed by mavenization.  Alejandro asked me to work on it again after the mavenization
work is done.  The HBase and Pig rpms were committed a while back, and the ZooKeeper package
(ZOOKEEPER-999) is done and waiting for commit.  HCatalog is also in progress (HCATALOG-63).
Hive is probably the only one left that does not produce rpm/deb packages.  The work was done
recently, and Hadoop is most likely to release rpm/deb packages.  It would be interesting to
see the response from the Hadoop community to determine whether yum repos on Apache make sense.

regards,
Eric