httpd-dev mailing list archives

From Graham Dumpleton <grah...@apache.org>
Subject Re: commercial support
Date Mon, 24 Nov 2014 00:57:13 GMT
On 24 November 2014 at 04:59, Jeff Trawick <trawick@gmail.com> wrote:

>
> If you're doing Python web apps it would be cool to "pip install httpd
> FRAMEWORK-httpd-wiring" and have a command that wires it up based on
> framework settings and a bit of other declarative configuration.  (similar
> for other ecosystems with a packaging/build infrastructure)  mod_wsgi
> actually has a version in PyPI that works like this, although it doesn't
> bring httpd with it.
>

Downloading and compiling the whole of httpd as a side effect of doing a
Python pip install isn't really practical. The process would just take too
long for a start, plus it doesn't solve the problem that many systems will
not have the dependencies installed in order to compile it. You don't want
to have to also be separately downloading and compiling APR, APR-util and
PCRE now that they aren't bundled with the Apache source code.

I have tried going that path, albeit not triggered by pip, in trying to
create a build pack for Heroku which could be used to bring mod_wsgi to
that PaaS and it was a right pain, especially since the resulting size of
all the compiled components would chew up a significant part of the image
slug allowance that Heroku gave you. In the end I gave up on it because it
was so customised and unsupported by Heroku that no one would be likely to
use it.

So the pip installable mod_wsgi does at least rely on you having the httpd
and httpd-dev packages, plus any dependencies for those, installed.
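
As a rough illustration of that prerequisite, the sketch below checks
whether an apxs build tool (normally provided by the httpd-dev package) can
be found on the PATH before a pip install is attempted. The helper name
find_apxs and the list of candidate names are assumptions made for this
example and are not part of mod_wsgi itself.

```python
import shutil

# Hypothetical list of names under which the Apache build tool may be
# installed. This is an assumption; a distribution may use another name.
APXS_CANDIDATES = ("apxs", "apxs2")


def find_apxs(candidates=APXS_CANDIDATES):
    """Return the path of the first apxs-like tool found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    apxs = find_apxs()
    if apxs is None:
        print("apxs not found: install httpd and httpd-dev first")
    else:
        print("apxs found at", apxs)
```

If the check fails, the pip install would be expected to fail at the
compile step, so it is cheaper to find out up front.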

This still doesn't help with PaaS services which have such a narrow view of
what they want to allow you to do. For example, Heroku will not provide the
httpd and httpd-dev packages in the operating system image they use to
allow people to run it using their own custom configurations and compile
and use their own Apache modules. It even took me a couple of years at
least to get Heroku to update their Python installations so they provided
shared libraries and so allow any sort of dynamically loaded embedded
system such as the mod_wsgi module inside of Apache to be able to use their
Python installations. Before that I would have to also compile Python
source code from scratch as well.

Heroku isn't the only PaaS who has gone down a path which makes it near on
impossible to use them with Apache and a customised setup. OpenShift does
actually provide an Apache/Python/mod_wsgi cartridge, but they hardwire the
Apache configuration and you cannot change it. The particular configuration
actually has various problems in the way it is done and so provides a
suboptimal experience. They also use the very old mod_wsgi version shipped
by the RHEL version they use. Even if you could get around that, you can't
change the Apache configuration or even the startup command, and it isn't
even possible to build an Apache module from scratch as they don't install
the httpd-dev package for RHEL.

The only PaaS where I could do what I want and use the pip installable
mod_wsgi was dotcloud. This was because it was what became docker and so
allowed you to install the missing httpd-dev package in your own space,
making it possible to then actually compile custom Apache modules.

So for me, in turning around the rapid decline in mod_wsgi usage caused by
the narrow options most PaaS providers give you, docker is definitely the
way forward.

The idea of a pip installable mod_wsgi is therefore twofold.

The first is to work around the fact that Linux distributions ship very out
of date versions of packages. Most Linux distributions are over a dozen
releases behind on mod_wsgi.

The second is that the pip installable mod_wsgi does more than just compile
the mod_wsgi Apache module. It also installs a script called
mod_wsgi-express that automatically generates an Apache configuration for
you which is set up properly for mod_wsgi. This is what Jeff is alluding to
in saying 'a command that wires it up based on framework settings and a bit
of other declarative configuration'.
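
To make that concrete, here is a minimal sketch of a WSGI application that
such a generated configuration could serve. The file name wsgi.py and the
response text are assumptions for illustration only. The start-server
invocation shown in the comment is the usual way of running the script, but
check the help output of your installed version for the options it
supports.

```python
# wsgi.py (hypothetical file name)
# Typically started with something like:
#   mod_wsgi-express start-server wsgi.py


def application(environ, start_response):
    """Minimal WSGI callable, using the conventional name 'application'."""
    body = b"Hello from mod_wsgi-express\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

The point is that the user supplies only the application; no hand-written
Apache configuration is involved.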

This solves another serious problem that mod_wsgi has had over the years.
That is that the default Apache configuration isn't particularly
appropriate. This is especially the case for prefork MPM where Python code
is run in embedded mode inside of the Apache child worker processes rather
than in mod_wsgi daemon mode, whereby the Python code runs in separate
processes. This isn't helped by what I would argue is a somewhat flawed
child worker dynamic scaling algorithm in Apache which causes too much
process churn, negatively affecting embedded systems which have a large
startup cost.

So what mod_wsgi-express does is provide a turnkey solution for setting up
Apache with Python as a form of appliance, which is going to suit the
majority of cases where users are just running a single Python web
application. I can take all the knowledge I have accumulated over the years
as to the best way of setting up Apache for Python web applications to
avoid problems, and distil that into a custom streamlined Apache
configuration. Even though it can still require some minor tuning to match
your specific Python web application, it does all the core setup that most
people wouldn't even do for Python.

To that end I am trying to combat the perception that Apache is slow and
bloated for Python web applications, when in fact the poor experience is
usually because people are using an old Apache and never set it up
properly, by providing a best of class configuration for that use case.

Where does docker fit into this?

For mod_wsgi at least, docker means I can provide my own base docker image
which has Apache 2.4 and the latest mod_wsgi version installed. A user would
then simply create their own docker image deriving from that, which adds in
their Python web application code. For the simplest case, using some of the
ONBUILD features of docker, the Dockerfile for their project could be one
line. Add a second line if they want to override the number of
processes/threads used for the Python web application to deal with whatever
throughput requirement they have. The mod_wsgi-express script will deal
with everything else for them when generating the configuration.
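
As a sketch of what overriding the number of processes/threads amounts to
underneath, the snippet below assembles a mod_wsgi-express command line.
The --processes and --threads options are taken to be the relevant
start-server options. The helper name express_command, the script name and
the numbers are hypothetical and chosen only for illustration.

```python
def express_command(script, processes=None, threads=None):
    """Build a mod_wsgi-express command line as a list of arguments.

    Hypothetical helper: only options that were explicitly requested are
    added, so everything else is left to the generated defaults.
    """
    cmd = ["mod_wsgi-express", "start-server", script]
    if processes is not None:
        cmd += ["--processes", str(processes)]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    return cmd


if __name__ == "__main__":
    # Example values only; tune them to the throughput you actually need.
    print(" ".join(express_command("wsgi.py", processes=3, threads=5)))
```

Leaving both values unset keeps the command down to the script name alone,
which mirrors the one line case described above.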

My view of how docker should perhaps be harnessed is therefore not to try
and provide one docker image for Apache that just gives you a generic
Apache configuration and then leaves everything up to the user, as has been
done in the past. Instead, create specialised docker images for using
Apache in certain roles. Provide a much more minimal interface for
customising the configuration, where the Apache configuration is generated
by a script written by someone who actually understands how to set up
Apache for that use case, and so streamline it and make it run at its best
for the narrow use case that docker image was intended for.

I am not actually far off being able to offer this docker appliance image
for mod_wsgi and Python web applications. Mostly a case of just finding the
time to work out all the requirements to get it up on the docker hub. Am
also trying to sort out issues with the official docker Python image, which
like Heroku has made the mistake of not shipping shared libraries. So right
now I can't base off their image. The other option was to base it off the
official docker image for httpd. That image, though, is useless as a base
for building other Apache modules, as at the end of the build they strip
out all the dependencies that were originally required to actually build
Apache and any additional modules.

These latter issues with the official docker images for both httpd and
Python show another problem, quite similar to some of the things I have
seen with PaaS providers. That is that the people implementing those
systems aren't strong Apache users themselves. They therefore create
systems which they think are going to be suitable for a wide range of use
cases, whereas in fact they are only suitable for very narrow use cases
and perform poorly in other cases. We need to do a better job of
highlighting that such solutions aren't actually workable and explain why.
If we don't say anything, then nothing changes and people continue to have
a bad experience of using Apache when the fault isn't Apache's but lies in
how the provider set it up.

Anyway, enough rambling.

Graham
