hawq-dev mailing list archives

From Konstantin Boudnik <...@apache.org>
Subject Dependencies [Was: Build environment]
Date Fri, 12 Feb 2016 06:51:40 GMT
Going through this further: two dependencies of HAWQ in particular seem to be a
big snag for downstream integration. I managed to get most of the bits and
pieces into a tight and compact toolchain, so that the project can be built on a
generic CentOS, but there are at least two libs that don't fit into the picture:
 - libhdfs3
 - thrift-devel

From the Apache Bigtop standpoint, build dependencies should either be available
from the standard distro repos or be built from source during the
component's package creation. Say, libyarn could be built this way. However,
for the two above, either the packages are not available from distros or the
sources are not provided as a part of the project dependencies (again, libyarn
might be considered as an example). Hence, the question to the development
community here: why is libyarn specifically included as a source dependency,
but the other two aren't? Is it possible to add those to depends/ and
improve the build to the point where they get built before HAWQ, so that the
autoconf requirements are satisfied?
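
To make the question concrete, here is a rough sketch of what "build everything
under depends/ first" might look like. Only depends/libyarn is taken from the
tree as described above; depends/libhdfs3 and depends/thrift, the helper names
(run, build_dep, build_all), the DRY_RUN and PREFIX variables, and the
configure/make/make install sequence for each dependency are assumptions made
purely for illustration.

```shell
#!/bin/sh
# Sketch only: build bundled dependencies before HAWQ itself.
# Assumptions (not from the HAWQ tree): depends/libhdfs3 and depends/thrift
# exist, and every dependency builds with configure / make / make install.
set -eu

PREFIX="${PREFIX:-/usr/local}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_dep() {
    # $1 = directory of one bundled dependency
    dir="$1"
    run sh -c "cd '$dir' && ./configure --prefix='$PREFIX' && make && make install"
}

build_all() {
    # Dependencies first, so that HAWQ's configure can find them.
    for dep in libyarn libhdfs3 thrift; do
        build_dep "depends/$dep"
    done
    run sh -c "./configure --prefix='$PREFIX' && make"
}
```

Calling build_all prints the four build steps in order; setting DRY_RUN=0
beforehand would execute them. A package build could then call this one entry
point instead of fetching prebuilt libraries from an external repo.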

From the UX standpoint, it is a very iffy practice to have a source
tree with non-standard deps that have to be downloaded and built
separately, as if they were different software projects. Back in the early 90s
I used to build software for BSD or Solaris, where you had to download, build,
and install 20+ different libs first, and only then would you be able to finally
start building what you really needed. Well, it is 2016 and the Unix world is
very different now, and way more developer-friendly. I'd suggest we borrow from
today's practices instead of the quarter-century-old ones.

libhdfs3 and libyarn also seem to be runtime dependencies of
HAWQ, right? Which means that if HAWQ is installed as a standard package,
then those two (and perhaps thrift) will have to be provided as packages as
well. Which makes the reliance on unknown 3rd party repos a big security

So, any thoughts from the community on the steps to make this better before
the release is out? If we're to add more source code into the project - this
is the perfect time to do it.


On Thu, Feb 11, 2016 at 12:16PM, Konstantin Boudnik wrote:
> Looking a bit further into the build dependencies, it seems that the
> environment relies very heavily on 3rd party repos and libs maintained by
> someone, somewhere. While it is up to the community how they handle their
> builds, adding repos and packages which aren't maintained in an open fashion
> won't help to attract new contributors. Cases in point:
>   - https://bintray.com/wangzw/rpm/rpm
>   - http://darcs.idyll.org/~t/projects/figleaf-0.6.1.tar.gz
>   - http://sourceforge.net/projects/pychecker/files/pychecker/0.8.19/pychecker-0.8.19.tar.gz/download
> This certainly will be an integration blocker for the downstream projects like
>     https://issues.apache.org/jira/browse/BIGTOP-2323
> One other point, which seems a bit weird: what's the point of
>     yum install -y postgresql-devel
> followed by
>     yum erase -y postgresql postgresql-libs postgresql-devel
> Thanks
>   Cos
> On Tue, Feb 09, 2016 at 06:06AM, Konstantin Boudnik wrote:
> > Gents,
> > 
> > have you considered creating and checking in a wrapper script (run-build.sh or
> > whatever) for the build, instead of writing lengthy shell scripts in Jenkins?
> > Then instead of
> > 
> > docker run --rm=true -v `pwd`:/data -u root rlei/mydocker:latest /bin/sh -c
> > "date; \
> > 
> > cd /data....
> > 
> > [another 23 lines of shell script one has to type each time.....]
> > 
> > one would need to run 
> > 
> > docker run --rm=true -v `pwd`:/data -w /data -u root rlei/mydocker:latest ./run-build.sh
> > 
> > and be done with it.
> > 
> > Cos
> > 
> > 
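
For illustration, a checked-in wrapper along the lines suggested above could
start out as small as the sketch below. The actual Jenkins steps are elided in
the quoted message, so the body of main is a placeholder and not the real HAWQ
build; the SRC_DIR variable and the log helper are likewise hypothetical.

```shell
#!/bin/sh
# run-build.sh: hypothetical wrapper kept at the top of the source tree.
# The real Jenkins steps are elided in the thread; the body of main below
# is a placeholder, not the actual HAWQ build commands.
set -eu

# With `docker run -w /data` the working directory is already the source tree.
SRC_DIR="${SRC_DIR:-$(pwd)}"

log() {
    # Timestamped progress line, in place of a bare `date` call.
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"
}

main() {
    log "build started in $SRC_DIR"
    cd "$SRC_DIR"
    # The lines currently typed into the Jenkins job would move here, e.g.:
    # ./configure && make
    log "build finished"
}

main
```

With the script marked executable and the working directory set by -w /data,
it would be invoked as ./run-build.sh, because the current directory is
normally not on PATH inside the container. Any change to the build steps then
goes through review like the rest of the source, instead of living only in the
Jenkins job configuration.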
