Mailing-List: contact dev-help@httpd.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@httpd.apache.org
Date: Tue, 15 Jul 2003 15:30:23 +0200 (CEST)
From: Michal Szymaniak <michal@cs.vu.nl>
To: Jeff Trawick <trawick@attglobal.net>
cc: dev@httpd.apache.org
Subject: Re: [PATCH] UDP Listeners (was Re: DNS+HTTP redirection system inside
 an Apache module)
In-Reply-To: <3F13F3CD.3070409@attglobal.net>
Message-ID: <Pine.GSO.4.53.0307151435240.7744@flits.cs.vu.nl>
References: <Pine.GSO.4.53.0307142254390.25665@flits.cs.vu.nl>
 <3F137E1D.7040702@apache.org> <3F13F3CD.3070409@attglobal.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII


Jeff,

Yes, I do realize that it is not obvious how it works, especially without
reading the papers :-) I will try to clear it up a bit, but you will need
the source code of NetAirt, as I am going to refer to it below
(http://www.globule.org/netairt/netairt-2.0.46-0.2-thin.tgz).

> I'm at a loss as to what the big picture is with the programming model
> enabled by this path.  I could go read the whole "DNS+HTTP redirection
> system" but I'd rather not :)

:-) I guess there is not too much reading anyway: you can find technical
details in the master's thesis -- just skip all the introductory stuff.
The paper version only gives a general picture, no technical details
whatsoever.

> How does process_func field ever get set?  What type of MPM thread would
> be used to process the UDP -- the type of thread that polls on sockets
> or the type of thread that processes work.  It doesn't fit well within
> the rest of Apache to assume that the "accept"-ing thread should process
> the message.

OK, here is how it goes. Listeners are created in configuration
directive callbacks. This is where you append listeners to the global
list, and this is where you set both the accept_func and process_func
fields (see mod_netairt.c::dns_config_mode()). UDP listeners are marked
as 'configured' so that Apache does not touch them, although at this
stage their sockets are still unconfigured. Since allocation is done in
the module, I need old_listeners to be visible in order to re-use
sockets (see dns_comm.c::dns_comm_alloc_listener(), which is quite
similar to Apache's original alloc_listener() ;-)).
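
A minimal sketch of the idea above, not the NetAirt code itself: a
listener record carrying two callbacks is filled in and linked into a
global list from what would be a directive callback. The struct below
is a hypothetical stand-in for ap_listen_rec, and the names listen_rec,
configured, udp_accept, udp_process and add_udp_listener are all
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Apache's listener record; only the fields
 * discussed in this mail are modelled. */
typedef struct listen_rec listen_rec;
typedef int (*accept_fn)(listen_rec *lr);
typedef int (*process_fn)(listen_rec *lr);

struct listen_rec {
    listen_rec *next;
    int         sd;           /* socket descriptor, -1 until created */
    int         configured;   /* nonzero: the core should leave it alone */
    accept_fn   accept_func;
    process_fn  process_func; /* NULL for ordinary TCP listeners */
};

static listen_rec *listeners = NULL;  /* models the global listener list */

/* Placeholder callbacks; the real ones would read and answer datagrams. */
static int udp_accept(listen_rec *lr)  { (void)lr; return 0; }
static int udp_process(listen_rec *lr) { (void)lr; return 0; }

/* What a configuration directive callback would do: allocate the
 * record, set both callbacks, and link it at the head of the list. */
static listen_rec *add_udp_listener(void)
{
    listen_rec *lr = calloc(1, sizeof(*lr));
    if (lr == NULL)
        return NULL;
    lr->sd = -1;
    lr->configured = 1;
    lr->accept_func = udp_accept;
    lr->process_func = udp_process;
    lr->next = listeners;
    listeners = lr;
    return lr;
}
```

The socket itself is deliberately left at -1 here, since in the scheme
described it is only set up later.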

Once allocated, listeners survive until the end of configuration.
In the post_config hook, I call dns_config::dns_config_init() to
configure the UDP listeners (setsockopt calls and binding).
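
For readers unfamiliar with the socket side, the configuration step
could look roughly like the sketch below. It uses plain POSIX calls
rather than APR, the function name bind_udp is hypothetical, and
SO_REUSEADDR is only an assumed example of an option; which options
NetAirt actually sets is not stated here.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create a UDP socket, set one option and bind it to ip:port.
 * Returns the descriptor, or -1 on any failure. */
static int bind_udp(const char *ip, unsigned short port)
{
    struct sockaddr_in sa;
    int on = 1;
    int sd = socket(AF_INET, SOCK_DGRAM, 0);
    if (sd < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1
        || setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0
        || bind(sd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(sd);   /* do not leak the descriptor on failure */
        return -1;
    }
    return sd;
}
```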

What we have now is a working UDP listener with two custom functions,
'accept' and 'process'. Since accepts are serialized, we don't want to do
anything long-running there. So, inside 'accept', we only retrieve all
immediately available datagrams, create a fake TCP socket to return
(to fool some MPMs, such as worker, which tries to close the socket in
some special cases), and register the listener in the transaction pool,
together with the list of just-received datagrams (see
dns_comm.c::dns_comm_accept_udp()).
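
The "retrieve all immediately available datagrams" part can be
sketched as a non-blocking receive loop. This is an illustration with
plain sockets, not the code from dns_comm_accept_udp(); the dgram
struct, the two size limits and the name drain_datagrams are
assumptions, and MSG_DONTWAIT is assumed to be available (it is on
Linux and the BSDs).

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define DGRAM_MAX 512   /* assumed buffer size per datagram */
#define BATCH_MAX 16    /* assumed cap on datagrams per accept */

typedef struct {
    struct sockaddr_storage from;     /* sender, needed for the reply */
    socklen_t               fromlen;
    ssize_t                 len;      /* payload length */
    unsigned char           data[DGRAM_MAX];
} dgram;

/* Collect the datagrams already queued on sd without ever blocking.
 * Returns how many were stored in batch (0..max). */
static int drain_datagrams(int sd, dgram *batch, int max)
{
    int n = 0;
    while (n < max) {
        dgram *d = &batch[n];
        d->fromlen = sizeof(d->from);
        d->len = recvfrom(sd, d->data, sizeof(d->data), MSG_DONTWAIT,
                          (struct sockaddr *)&d->from, &d->fromlen);
        if (d->len < 0)
            break;  /* queue empty (EAGAIN/EWOULDBLOCK) or a real error */
        n++;
    }
    return n;
}
```

Capping the batch keeps the serialized 'accept' step short even under
load; whatever is left in the queue is picked up by the next accept.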

The fake TCP socket and the transaction pool (with our stuff hidden
inside) are passed to the connection-processing part. This is where things
go wild. I patched the Apache connection-processing function to check for
the hidden UDP listeners before creating the connection record. If found,
the customized 'process' function is called, and the connection processing
is silently aborted afterwards.

The 'process' function destroys the fake TCP socket, digs up the hidden
datagram list and processes it (see dns_comm.c::dns_comm_process_udp()).
This happens in the thread that called the connection-processing function,
so I think it is the right one. At least, it works fine for both
prefork and worker. After the datagrams are processed and the responses
sent (using the same UDP socket, with a mutex for correctness), the UDP
listener and the datagram list are unregistered from the transaction
pool, as these pools are usually reused. Voila.
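
The whole hand-off can be modelled in a few lines. The sketch below is
a toy: toy_pool is a one-slot stand-in for the userdata table of an APR
pool, and LISTEN_KEY, udp_listener, toy_accept and
toy_process_connection are hypothetical names. It only shows the order
of events: 'accept' hides the listener in the pool, connection
processing finds it, runs the work, and unregisters it so that a reused
pool does not trigger the UDP path again.

```c
#include <stddef.h>
#include <string.h>

#define LISTEN_KEY "udp_listen_rec"   /* hypothetical key name */

/* Toy stand-in for a pool's userdata table: a single key/value slot. */
typedef struct { const char *key; void *data; } toy_pool;

static void pool_set(toy_pool *p, const char *key, void *data)
{
    p->key = key;
    p->data = data;
}

static void *pool_get(const toy_pool *p, const char *key)
{
    return (p->key && strcmp(p->key, key) == 0) ? p->data : NULL;
}

typedef struct udp_listener {
    int pending;     /* datagrams collected by 'accept' */
    int processed;   /* datagrams handled so far */
} udp_listener;

/* 'accept' step: note the received datagrams and hide the listener
 * in the transaction pool. */
static void toy_accept(toy_pool *ptrans, udp_listener *lr, int ndgrams)
{
    lr->pending = ndgrams;
    pool_set(ptrans, LISTEN_KEY, lr);
}

/* Connection-processing step. Returns 1 if the request was handled as
 * UDP (no connection record is built), 0 for the normal TCP path. */
static int toy_process_connection(toy_pool *ptrans)
{
    udp_listener *lr = pool_get(ptrans, LISTEN_KEY);
    if (lr == NULL)
        return 0;
    lr->processed += lr->pending;        /* the 'process' work */
    lr->pending = 0;
    pool_set(ptrans, LISTEN_KEY, NULL);  /* unregister: pools are reused */
    return 1;
}
```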

> Why does ap_old_listeners need to be externalized?

So that the module-private listener allocator can re-use them (see above).

> About this code below: It seems like process_func is a flag that means
> ????? (I dunno; why wouldn't we be using the existing record here).  Is
> that inherently connected to having a process_func, or are those
> separate issues?
>
>            /* Some listeners are not real so they will not have a
> bind_addr. */
>            if (sa) {
>                apr_sockaddr_port_get(&oldport, sa);
> !             if (!strcmp(sa->hostname, addr) && port == oldport &&
> !(*walk)->process_func) {
>                    /* re-use existing record */
>                    new = *walk;
>                    *walk = new->next;

When writing NetAirt, I assumed that the only UDP listeners inside would
be mine, all using 'process' functions. Since you can have 2 listeners
operating on the same host:port pair (1 TCP and 1 UDP), I had to
distinguish them somehow, so I used the process_func field, which nobody
else would use. I agree that it is not a very clean solution -- it simply
assumes that process_func is only defined for UDP sockets.
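
To make the matching rule concrete, here is a small sketch of looking
up a reusable record when a TCP and a UDP listener share one host:port
pair. The old_listener struct and find_reusable() are hypothetical; the
only thing taken from the patch is the convention that a non-NULL
process_func marks a UDP listener.

```c
#include <stddef.h>
#include <string.h>

typedef struct old_listener {
    const char    *host;
    unsigned short port;
    void (*process_func)(void);  /* non-NULL only for UDP listeners */
} old_listener;

/* Return the first record matching host and port whose kind (UDP or
 * TCP, judged by process_func) is the one asked for, or NULL. */
static old_listener *find_reusable(old_listener *recs, size_t n,
                                   const char *host, unsigned short port,
                                   int want_udp)
{
    for (size_t i = 0; i < n; i++) {
        int is_udp = recs[i].process_func != NULL;
        if (strcmp(recs[i].host, host) == 0
            && recs[i].port == port
            && is_udp == (want_udp != 0))
            return &recs[i];
    }
    return NULL;
}

/* Placeholder used only to mark a record as UDP in examples. */
static void dummy_process(void) {}
```

An explicit protocol field in the record would remove the need to
infer the kind from process_func.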

> Regarding the code below:  Setting a key in a pool is not an appropriate
> way to get Apache to do some special processing on a socket.  Surely the
> problem is that Apache doesn't have quite the right hook currently and
> an existing hook needs to be modified in Apache 2.1-dev or some new hook
> is needed?
>
>    {
>        apr_status_t rv;
> !     conn_rec *c;
> !     ap_listen_rec * lr = NULL;
> !
> !     rv = apr_pool_userdata_get((void**)&lr,AP_LISTEN_REC_KEY,ptrans);
> !     if (rv == APR_SUCCESS && lr && lr->process_func) {
> !         apr_pool_userdata_set(NULL,AP_LISTEN_REC_KEY,NULL,ptrans);
> !         lr->process_func(csd,lr,server,ptrans);
> !         return NULL;
> !     }

Of course, adding a new hook at this stage would be a much cleaner
solution. Somehow I could not figure out how to do that myself, so I
came up with hacking the connection-processing function. And if you
implement it as a new hook, could you also think about a better way
of passing data from 'accept' to that new hook? You are right that
using pools as bags is not the most elegant thing to do :-)

All this shows that running UDP in Apache can make sense, and I tried
to draw some attention to it a year ago. But somehow everybody
thought that UDP inside Apache could only be used for HTTP-over-UDP,
which is indeed controversial. I hope that full UDP support will not
be neglected in the next Apache release :-)

Regards,
M.
--
Michal Szymaniak | mailto:michal@cs.vu.nl | http://www.cs.vu.nl/~mszyman