httpd-dev mailing list archives

From "William A. Rowe, Jr." <wr...@lnd.com>
Subject RE: ap_cleanup_fn_t
Date Tue, 04 Apr 2000 04:50:29 GMT
> From: Greg Stein [mailto:gstein@lyra.org]
> Sent: Monday, April 03, 2000 9:32 PM
>
> I visited Covalent last week, and spoke with Ryan, Daniel, and Doug about
> configuration stuff and ways to leverage those changes into a working
> mod_info. Sure, it doesn't work today, but it will work fine in 2.0

I understand mod_info will be a whole new beast, and I'm looking forward to
what they are putting forth !

> Please consider this message a
> discussion, rather than a knock against your efforts...

Happily

> The issue, of course, is that Bill (Stoddard) is probably a
> bit backed up on reviewing those patches (and sorting through
> your monster patches :-)

Not an issue, of course, since there is a ton of fixups to do... and Bill
has his own long list of things he is actively pursuing (the list of Apache
2.0 - 'Bill Stoddard is working on this' statuses seems to grow weekly :~)

> > Two, you are eating plenty of extra clock cycles by not forewarning the
> > ApacheCore that the functions reside in a DLL.  The linker is writing
> > plenty of fixup thunks for you.
>
> Hrm. How is this forewarning occurring? I haven't seen this.

Today?  It isn't... lol.

Works like this... the compiler sees a function prototype with no declspec,
so it emits an ordinary in-process call to an address in the local image.

The linker then sees that the symbol resolves to a dll, binds the function
pointer, and thunks the call through a fixup pointer into the external
module's address space.  (Effectively, late binding.)

__declspec(dllexport) tells the compiler that the symbol will be exposed,
adds it to the external declarations table, and resolves the pointer to the
local address space.  Internal calls assume the local address space and
don't even dispatch.

__declspec(dllimport) tells the compiler it will _never_ see a real function
body, so the call goes directly through the dll resolution table (an
indirect jump) to wherever the loader hooks it.  A couple of opcodes are
dropped by this optimization.  (Effectively, early binding.)
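
For what it's worth, here is a minimal sketch of how a header could throw
that switch once, so the library build gets dllexport and every consumer
gets dllimport.  MYLIB_API, MYLIB_BUILD and mylib_add are made-up names for
illustration, not anything in the Apache or APR tree, and the file defines
MYLIB_BUILD itself only so the sketch compiles as a single unit.

```c
/* Pretend this translation unit is part of the library build.
 * A consumer would NOT define MYLIB_BUILD (hypothetical name). */
#define MYLIB_BUILD 1

#if defined(_WIN32)
#  if defined(MYLIB_BUILD)
#    define MYLIB_API __declspec(dllexport)  /* compiling the dll itself */
#  else
#    define MYLIB_API __declspec(dllimport)  /* compiling a consumer */
#  endif
#else
#  define MYLIB_API                          /* no decoration elsewhere */
#endif

/* What the public header would carry. */
MYLIB_API int mylib_add(int a, int b);

/* What the library's .c file would carry. */
MYLIB_API int mylib_add(int a, int b)
{
    return a + b;
}
```

A consumer that includes the same declaration without MYLIB_BUILD sees the
dllimport form, which is what lets the compiler call through the import
table slot directly instead of through a linker-generated stub.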

> Definitely, we can improve speed by using numbers in the .def files, but
> what else is needed?

That's load time resolution of that address table, and yes, it's faster to
load, but doesn't affect the call/return cycle.

> > Three.... Four.... Five....
>
> Hrm. I'm having a hard time parsing through this.

Then I didn't do a good enough job explaining myself.

The public headers, apr_blah.h - are the build time resolution for the
interfaces.  And you are right, there are even tools to parse .h into VB
includes, etc...

Here's the rub.  You are assuming the user will set the compiler flags on
their project to be the same calling convention as the Apache build.  Is
this fair?

I have two modules, a and b.  a is a COM extender that lets me expose Apache
as a set of objects to script engines.  a is best built (in my project
team's estimation) using __stdcall for various reasons.  b is a raw byte
blobber that is best implemented using cdecl, since it has tons of recursive
vararg calls.

The same solution for _my_ compilation doesn't work at all for both a and b.
So... project a needs a _separate_ set of includes for Apache's function
declarations, that explicitly state the _cdecl calling convention?  Project
b needs no such thing.

Two months later, we (or I myself) determine that Apache should be built in
__stdcall for one reason or another.  Now I'm rewriting the .h files for
project b, and project a's headers need to be scrapped for the real Apache
headers.

Makes no sense.  And doesn't solve the declspec issues either.

> One item of background: as it stands today, all modules must be built
> with the same set of switches as Apache.

Ok, here I _agree_ with you!  To a point.  APR is, we think, a separate
'package'.  Everything else is destined for the core, or a dll to the core.
Because APR has public interfaces everyone needs, those should be declared
'in kind' with one another, with external or apr.h defines that throw the
switches for the external interface of the APR package.

Internal switches are great, and that will control every internal-use
function.  No doubt cdecl will probably outperform stdcall in those
scenarios, but as long as the switch is thrown for all the APR .c modules in
sync, we can experiment.

The same goes for the ApacheCore.  If it's external, the existing (and
reworked) API_EXPORT stuff should work fine.  Again, throw the switches at
the external or httpd.h/ap_config.h level and we are off.  Throw the compile
time switches, and we optimize the internal calls.
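
As a rough sketch of what "throwing the switch at the header level" could
look like: the public declarations name their calling convention
explicitly, so a module built with a different default still calls them
correctly.  DEMO_CALL, demo_callback_t, demo_apply and demo_add_offset are
hypothetical names, not the real API_EXPORT macros.

```c
/* The external convention is pinned in one place.  On MSVC an explicit
 * keyword on the declaration takes precedence over the compiler's
 * default-convention switch; other compilers here get no decoration. */
#if defined(_MSC_VER)
#  define DEMO_CALL __cdecl
#else
#  define DEMO_CALL
#endif

/* Callback type and entry point as a public header would declare them. */
typedef int (DEMO_CALL *demo_callback_t)(int value, void *ctx);

int DEMO_CALL demo_apply(demo_callback_t cb, int value, void *ctx);

/* Library side: forwards to the caller-supplied callback. */
int DEMO_CALL demo_apply(demo_callback_t cb, int value, void *ctx)
{
    return cb(value, ctx);
}

/* Module side: the callback carries the same explicit convention. */
int DEMO_CALL demo_add_offset(int value, void *ctx)
{
    const int *offset = ctx;
    return value + *offset;
}
```

Changing the external convention later would then mean editing the one
define rather than rewriting headers per project.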

> I'm unsure of what "wrappers" you're referring to. Are these wrappers that
> we supply or ones that the module developer supplies?

As I read you, I'm assuming the module developer.

> > I really foresaw Apache 2.0 offering additional flexibility while buying
> > optimizations that would increase our market appeal, not shrink them
> > down to a single inflexible configuration.  I think that was already
> > done quite admirably (no sarcasm or derision here) in httpd.  That
> > vision for Apache 2.0 is certainly of no interest to me.
>
> I understand and agree with your general sentiment. However, I am not
> sure what we buy if we have the ability to easily flip between calling
> conventions. As I see it, we have:
>
> *) pro: no call-convention macro means we never accidentally omit it
> *) pro: no macro keeps the source "lighter"
> *) con: module developers using languages that cannot
>         generate cdecl code must work harder
>
> Hmm. It would be nice to have a longer list there :-)

  *) pro: declarations fit on fewer source and header lines.
  *) con: fixup opcodes to relay calls from local addresses to dll
          slow the entire server.  Greater reliance on APR calls
          increases this effect relative to the 1.3.x family.
  *) con: less documentation between 'internal' and 'exposed' funcs.
  *) con: 3rd party modules must be compiled in the same model and
          calling convention as Apache itself, without consideration.


> I think one point that I'd like to mention is simply: providing Yet
> Another Way To Compile Apache means that we introduce more
> complexity into the config/build process (to support the increased
> flexibility). If that additional flexibility does not buy us a
> whole lot, then I'm in favor of trading it out for the resulting
> simplicity.

By retaining flexibility today, we can brick it in later.  But without
flexibility early on, we can't put it to the bench and find out what
works for the bottom line of performance.

> > The -original- problem to resolve is in fixups.  Get rid of them.
>
> Which fixups are these? The fact that we don't have ordinals in the .def
> file, or the linkage for the data?

See my comments above, or read the MS VC __declspec docs.

> > As for thunks, just say no :~)
>
> Note that I'm not referring to Win16 thunks :-). Gawd, you're right on
> that part... just say no.  No, I mean a simple routine such as:
>
> struct start_thread_data {
>   ap_thread_start_t func;
>   void *data;
> };
> unsigned int __stdcall thunk(void *ctx)
> {
>     struct start_thread_data *std = ctx;
>
>     /* ### what to do with the return value? */
>     (void) (*std->func)(std->data);
>
>     return 0;  /* ### what does Windows want here? */
> }
>
>
> ap_status_t ap_create_thread(...)
> {
>     struct start_thread_data std = { func, data };
>     ...
>
>     ... _beginthreadex(NULL, 0, thunk, &std, 0, &temp) ...
>
>     ...
> }
>
>
> This eliminates the nasty casting in ap_create_thread's use of
> _beginthreadex(), and it means the thread functions don't have to declare
> a special calling convention.

I'm not even sure this was really implemented through, and we
seem to have no consumers for the function yet, short of the
testthread code.  I don't mind seeing some cleanup here.

I've played all sorts of these games to get C callbacks
gracefully back into their C++ object, finding *this again.
I don't see how this alleviates telling consumers explicitly
how to declare their callback thread/hook/whatever function.
But it would be great to pass back useful data such as context
in a meaningful and clear way.
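
To make the idea concrete, here is a portable sketch of that trampoline
with the context handed through.  Everything in it is an assumption for
illustration: fake_begin_thread is a synchronous stand-in for
_beginthreadex(), sys_entry_t stands in for the __stdcall entry type it
expects, and the demo_* names are invented.  The context block is
heap-allocated here because a real thread may start after the creating
function's stack frame is gone.

```c
#include <stdlib.h>

typedef void *(*user_start_t)(void *data);   /* what a module writes    */
typedef unsigned (*sys_entry_t)(void *ctx);  /* what the platform calls */

struct start_thread_data {
    user_start_t func;
    void *data;
};

/* Trampoline: has the platform's signature, forwards to the user's. */
static unsigned trampoline(void *ctx)
{
    struct start_thread_data *std = ctx;
    user_start_t func = std->func;
    void *data = std->data;

    free(std);          /* the trampoline owns the context block */
    (void) func(data);  /* return value is dropped in this sketch */
    return 0;
}

/* Stand-in for _beginthreadex(): runs the entry point synchronously. */
static int fake_begin_thread(sys_entry_t entry, void *ctx)
{
    (void) entry(ctx);
    return 0;
}

/* Returns 0 on success, -1 if the context could not be allocated. */
int demo_create_thread(user_start_t func, void *data)
{
    struct start_thread_data *std = malloc(sizeof *std);

    if (std == NULL)
        return -1;
    std->func = func;
    std->data = data;
    return fake_begin_thread(trampoline, std);
}

/* Example user function: no special calling convention declared. */
void *demo_bump(void *data)
{
    ++*(int *)data;
    return data;
}
```

The consumer still has to match user_start_t, so this doesn't remove the
need to document the callback signature; it only keeps the platform's
convention out of the consumer's code.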

Tell ya what... would some assembly code help here?  Took me
a bit to work it out.  Note that the first assignments, a = 0
and b = 0, both cause a GP fault because they are simply wrong:
without dllimport they write to the local jump stubs _a and _b
rather than to the DLL's data.

I tried to force as many examples of fixups as possible.

---
:\Devel32\dllexamples\client\test.c  --------------------------------------
1:    extern a;
2:    extern _declspec(dllexport) b;
3:    extern _declspec(dllimport) c;
4:
5:    int fn_a(int t);
6:    int __declspec(dllexport) fn_b(int t);
7:    int __declspec(dllimport) fn_c(int t);
8:
9:    int main(int argc, char**argv)
10:   {
00401010   push        ebp
00401011   mov         ebp,esp
00401013   sub         esp,10h
11:       int t1, t2, t3, t4;
12:
13:       a = 0;
00401016   mov         dword ptr [_a(0x004010b2)],0
14:       b = 0;
00401020   mov         dword ptr [_b(0x004010ac)],0
15:       c = 0;
0040102A   mov         eax,[__imp__c(0x00416290)]
0040102F   mov         dword ptr [eax],0
16:
17:       t1 = fn_a(1);
00401035   push        1
00401037   call        _fn_a(0x004010a0)
0040103C   add         esp,4
0040103F   mov         dword ptr [t1],eax
18:       t2 = fn_b(2);
00401042   push        2
00401044   call        _fn_b(0x00401084)
00401049   add         esp,4
0040104C   mov         dword ptr [t2],eax
19:       t3 = fn_c(3);
0040104F   push        3
00401051   call        dword ptr [__imp__fn_c(0x0041629c)]
00401057   add         esp,4
0040105A   mov         dword ptr [t3],eax
20:
21:       t4 = a + b + c;
0040105D   mov         ecx,dword ptr [_a(0x004010b2)]
00401063   add         ecx,dword ptr [_b(0x004010ac)]
00401069   mov         edx,dword ptr [__imp__c(0x00416290)]
0040106F   add         ecx,dword ptr [edx]
00401071   mov         dword ptr [t4],ecx
22:
23:       return (t1 + t2 + t3 - t4);
00401074   mov         eax,dword ptr [t1]
00401077   add         eax,dword ptr [t2]
0040107A   add         eax,dword ptr [t3]
0040107D   sub         eax,dword ptr [t4]
24:   }
00401080   mov         esp,ebp
00401082   pop         ebp
00401083   ret
--- No source file ------------------------------------------------------------

_fn_b:
00401084   jmp         dword ptr [__imp__fn_b(0x00416298)]
_fn_c:
0040108A   jmp         dword ptr [__imp__fn_c(0x0041629c)]
_fn_a:
004010A0   jmp         dword ptr [__imp__fn_a(0x00416294)]
_c:
004010A6   jmp         dword ptr [__imp__c(0x00416290)]
_b:
004010AC   jmp         dword ptr [__imp__b(0x0041628c)]
_a:
004010B2   jmp         dword ptr [__imp__a(0x00416288)]

__imp__a:
00416288   1001EFCC
__imp__b:
0041628C   1001EFDC
__imp__c:
00416290   1001EFE0
__imp__fn_a:
00416294   10001005
__imp__fn_b:
00416298   1000100A
__imp__fn_c:
0041629C   1000100F

@ILT+0(_fn_a):
10001005   jmp         fn_a(0x10001030)
@ILT+5(_fn_b):
1000100A   jmp         fn_b(0x10001047)
@ILT+10(_fn_c):
1000100F   jmp         fn_c(0x10001065)

---
:\Devel32\dllexamples\library\example.c  ----------------------------------
1:
2:    extern int a;
3:    extern int __declspec(dllimport) b;
4:    extern int __declspec(dllexport) c;
5:
6:    int fn_a(int t) {
10001030   push        ebp
10001031   mov         ebp,esp
7:        return (a += t);
10001033   mov         eax,[_a(0x1001efcc)]
10001038   add         eax,dword ptr [t]
1000103B   mov         [_a(0x1001efcc)],eax
10001040   mov         eax,[_a(0x1001efcc)]
8:    }
10001045   pop         ebp
10001046   ret
9:
10:   int __declspec(dllexport) fn_b(int t) {
10001047   push        ebp
10001048   mov         ebp,esp
11:       return (b += t);
1000104A   mov         eax,[__imp__b(0x100196e8)]
1000104F   mov         ecx,dword ptr [eax]
10001051   add         ecx,dword ptr [t]
10001054   mov         edx,dword ptr [__imp__b(0x100196e8)]
1000105A   mov         dword ptr [edx],ecx
1000105C   mov         eax,[__imp__b(0x100196e8)]
10001061   mov         eax,dword ptr [eax]
12:   }
10001063   pop         ebp
10001064   ret
13:
14:   int __declspec(dllexport) fn_c(int t) {
10001065   push        ebp
10001066   mov         ebp,esp
15:       return (c += t);
10001068   mov         eax,[_c(0x1001efe0)]
1000106D   add         eax,dword ptr [t]
10001070   mov         [_c(0x1001efe0)],eax
10001075   mov         eax,[_c(0x1001efe0)]
16:   }
1000107A   pop         ebp
1000107B   ret

_a:
1001EFCC   00 00
1001EFCE   00 00
_b:
1001EFDC   00 00
1001EFDE   00 00
_c:
1001EFE0   00 00
1001EFE2   00 00

