hawq-dev mailing list archives

From Konstantin Boudnik <...@apache.org>
Subject Re: Dependencies [Was: Build environment]
Date Tue, 16 Feb 2016 11:46:51 GMT
On Tue, Feb 16, 2016 at 04:38PM, Lili Ma wrote:
> Actually, the thrift header file and library are used in parquet related
> code.

Then we need to find a better way of handling this mandatory dependency.
A sane user, esp. an enterprise one, won't grab an unknown, unsigned library
from a god-forsaken repo to install it on premise.

Cos

> Parquet is a table storage format you can specify when creating a table,
> just like row and column. It's an open source format, and its metadata
> is encoded in thrift.
> 
> Thanks
> Lili
> 
> On Tue, Feb 16, 2016 at 3:52 PM, Konstantin Boudnik <cos@apache.org> wrote:
> 
> > On Tue, Feb 16, 2016 at 12:44PM, Lei Chang wrote:
> > > > From the standpoint of simplifying the build & install process, it looks
> > > > like a good idea to include libhdfs3 just like libyarn.
> >
> > > Indeed. A quick look further into the code shows that you guys mightn't
> > > need thrift-devel libraries after all, because native thrift APIs aren't
> > > used anywhere (kudos to Roman's findings).
> >
> > > It'd be great to have these issues fixed before the release time. Here's
> > > why: an official release with mandatory _source_ dependencies developed
> > > and hosted elsewhere sounds neither reasonable nor safe. So, let's please
> > > address it before RC5 hits the voting floor.
> >
> > Thanks!
> >   Cos
> >
> >
> > > > On Fri, Feb 12, 2016 at 2:51 PM, Konstantin Boudnik <cos@apache.org> wrote:
> > >
> > > > > Going through this further. Two dependencies of HAWQ in particular seem
> > > > > to be a big snag for the downstream integration. I managed to get most
> > > > > of the bits and pieces into a tight and compact toolchain, so that the
> > > > > project can be built on a generic CentOS, but there are at least two
> > > > > libs that don't fit into the picture:
> > > > >  - libhdfs3
> > > > >  - thrift-devel
> > > >
> > > > > From the Apache Bigtop standpoint, build dependencies should either be
> > > > > available from the standard distro's repos or be built from source
> > > > > during the component's package creation. Say, libyarn could be built
> > > > > this way. However, for the two above, either the packages are not
> > > > > available from distros or the sources are not provided as a part of the
> > > > > project dependencies (again, libyarn might be considered as an example).
> > > > > Hence, the question to the development community here: why is libyarn
> > > > > specifically included as a source dependency, but the other two aren't?
> > > > > Is it possible to add those to depends/ and improve the build to the
> > > > > point where they get built before HAWQ, so the autoconf requirements
> > > > > are satisfied?
> > > >
> > > > > From the UX standpoint, it is a very iffy practice to have a source
> > > > > tree with non-standard deps that have to be downloaded and built
> > > > > separately, as if they were different software projects. Back in the
> > > > > early 90s I used to build software for BSD or Solaris, where you had to
> > > > > download, build, and install 20+ different libs first, and only then
> > > > > would you be able to finally start building what you really needed.
> > > > > Well, it is 2016 and the Unix world is very different now. And way more
> > > > > developer-friendly. I'd suggest we borrow from today's practices
> > > > > instead of the quarter-century-old ones.
> > > >
> > > > > libhdfs3 and libyarn also seem to be runtime dependencies for HAWQ,
> > > > > right? Which means that if hawq is installed as a standard package,
> > > > > then those two (and perhaps thrift) will have to be provided as
> > > > > packages as well. Which makes the reliance on unknown 3rd party repos
> > > > > a big security no-no.
> > > >
> > > > > So, any thoughts from the community on the steps to make this better
> > > > > before the release is out? If we're to add more source code into the
> > > > > project - this is the perfect time to do it.
> > > >
> > > > Cos
> > > >
> > > > On Thu, Feb 11, 2016 at 12:16PM, Konstantin Boudnik wrote:
> > > > > > Looking a bit further into the build dependencies, it seems that the
> > > > > > environment relies very heavily on 3rd party repos and libs
> > > > > > maintained by someone, somewhere. While it is up to the community how
> > > > > > they handle their builds, adding repos and packages which aren't
> > > > > > maintained in an open fashion won't help attract new contributors.
> > > > > > Cases in point:
> > > > >
> > > > >   - https://bintray.com/wangzw/rpm/rpm
> > > > >   - http://darcs.idyll.org/~t/projects/figleaf-0.6.1.tar.gz
> > > > >   -
> > > >
> > http://sourceforge.net/projects/pychecker/files/pychecker/0.8.19/pychecker-0.8.19.tar.gz/download
> > > > >
> > > > > > This certainly will be an integration blocker for downstream
> > > > > > projects like
> > > > > >     https://issues.apache.org/jira/browse/BIGTOP-2323
> > > > >
> > > > > One other point (which seems a bit weird). What's the point of
> > > > >     yum install -y postgresql-devel
> > > > > followed by
> > > > >     yum erase -y postgresql postgresql-libs postgresql-devel
> > > > >
> > > > > Thanks
> > > > >   Cos
> > > > >
> > > > > On Tue, Feb 09, 2016 at 06:06AM, Konstantin Boudnik wrote:
> > > > > > Gents,
> > > > > >
> > > > > > > have you considered creating and checking in a wrapper script
> > > > > > > (run-build.sh or whatever) for the build, instead of writing
> > > > > > > lengthy shell scripts in Jenkins? Then instead of
> > > > > >
> > > > > > > docker run --rm=true -v `pwd`:/data -u root rlei/mydocker:latest /bin/sh -c "date; \
> > > > > >
> > > > > > cd /data....
> > > > > >
> > > > > > [another 23 lines of shell script one has to type each time.....]
> > > > > >
> > > > > > one would need to run
> > > > > >
> > > > > > > docker run --rm=true -v `pwd`:/data -w /data -u root rlei/mydocker:latest run-build.sh
> > > > > >
> > > > > > and be done with it.
> > > > > >
> > > > > > Cos
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> >
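
The wrapper-script idea from the earliest message in the thread could look roughly like the sketch below. It is only a minimal illustration, not the project's actual script: the thread elides the real build steps, so `./configure` and `make` stand in as placeholders, and the `run` helper, the `build_steps` function, and the `DRY_RUN` and `JOBS` variables are hypothetical names introduced here.

```shell
#!/bin/sh
# run-build.sh: hypothetical wrapper checked in at the top of the source tree.
# The real build steps are not shown in the thread; the commands below are
# placeholders only.
set -eu

# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder steps; replace with whatever the Jenkins job does today.
build_steps() {
    run date
    run ./configure
    run make -j"${JOBS:-4}"
}

# Execute the steps only when invoked as run-build.sh, so the functions can
# be sourced (for example, in a test) without starting a build.
case "$0" in
    *run-build.sh) build_steps ;;
esac
```

With a script like this in the tree, `DRY_RUN=1 sh run-build.sh` would only print the steps, which makes it easy to check the wrapper before running it inside the container.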
