perl-dev mailing list archives

From Adam Kennedy <a...@phase-n.com>
Subject Assumption of guilt
Date Thu, 30 Dec 2004 08:37:08 GMT
Having read back a lot now I just thought I'd try to clear up a few 
problems regarding the PAUSE issue. Randal was fairly untactful earlier.

I'll try to lay out some background information on the core problem, as 
much for myself as for anyone on this list.

I see that everyone still considers the problem of mod_perl 2.0 
integration into the rest of perl an issue that is still unresolved. It 
would appear that for many of my questions you don't yet have answers.

From most of Stas' emails I see that he at least considers PAUSE to be 
the root problem here.

Unfortunately, it is not merely a case of PAUSE and a few CPAN clients 
needing to be changed.

Perl, at a fundamental level, has always had the assumption of a single 
API per namespace.

It isn't just PAUSE, it is assumed through all of the CPAN 
infrastructure, and every module that lives in it.

That's 8,300 packages, 40,000 individual .pm module files and 14 million 
lines of code from several thousand developers that all make that 
assumption. And that's ONLY the CPAN code. Everything that perl has ever 
done, every module and script ever written, largely makes that assumption.

It's the way everything expects modules to work, and has for the last 
decade.

All the modules have linear versions. The entire perl toolchain, that's 
core stuff, third party modules, and private toolchains all over the 
world, assume this. PAUSE, CPAN itself, the two most popular clients 
(CPAN.pm and CPANPLUS), every documentation system, every custom 
downloader, everything that accesses the core CPAN database in some form 
or another, every source parsing or other tool somewhere down the chain, 
all of it.

The assumption is that one API exists per node in the namespace.

From the consumer side of the API boundary the assumption is that if 
you use Foo::Bar you will always get the same API.

How that API is actually implemented is another matter, and irrelevant 
as long as the API stays the same and encapsulation is preserved.

Implementations can and have changed over time. GD (although it was a 
complete balls-up) survived the transition because while it moved behind 
the scenes from gd1 to gd2 it maintained the same API at the boundary 
within perl.

When Randal says "the world", he means every module ever written 
everywhere since the beginning of perl 5. Every script, every developer, 
every company. They all work together under the agreement that there is 
one API per package in CPAN.

Every database everywhere that stores or processes metadata on the 
modules will be considering the module names to be unique, certainly at 
an API level.

Hence the perldoc problem. perldoc and every other tool works because 
there is a one to one relationship between the namespace slot and the 
API, regardless of their actual names. They store all their data that way.

There are dozens or hundreds of systems that do this, including:

man
perldoc
cpan.org
rt.cpan.org
search.cpan.org
cpanratings.perl.org
Every documentation system
Every perl wiki that has module-keyed pages
The CPAN cross reference (in development)
The Debian package format

And those are just the common ones.

There is no way to effectively have more than one API per module name.

The effects of wanting to keep using Apache:: for a new API just keep 
cascading and cascading endlessly outwards.

There can only be one. This cannot be changed without changing 
everything. Without a complete generational change in the language, all 
the modules and all the code everywhere, it just isn't possible.

Which is why people have been saying to you guys "Maybe we can do this 
for perl 6", because perl 6 DOES involve going across a generational 
boundary.

Although granted you can make the alternative modules LOAD by doing the 
@INC trick of adding Apache2/ versions of everything, that's ALL you 
achieve. And there are tons of modules that load by doing incredibly 
evil things like that, but we generally put them into the Acme:: 
namespace, or whoever creates the module is responsible for doing all 
the work to ensure that it behaves exactly like everything else.
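
A minimal sketch of why the @INC trick only solves loading. The module 
name Demo::Handler, the api() method and the temporary directory layout 
are hypothetical and exist only for this illustration. Two files claim 
the same package name, and the order of @INC alone decides which one a 
consumer gets:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Scratch tree standing in for a site library (hypothetical layout).
my $root = tempdir(CLEANUP => 1);

# Write a tiny Demo/Handler.pm under $libdir whose api() returns $api.
sub write_module {
    my ($libdir, $api) = @_;
    my $dir = File::Spec->catdir($libdir, 'Demo');
    make_path($dir);
    my $file = File::Spec->catfile($dir, 'Handler.pm');
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "package Demo::Handler;\nsub api { '$api' }\n1;\n";
    close $fh or die "Cannot close $file: $!";
    return;
}

my $old_lib = File::Spec->catdir($root, 'lib');
my $new_lib = File::Spec->catdir($root, 'lib', 'Apache2');
write_module($old_lib, 'generation 1');
write_module($new_lib, 'generation 2');

# Load Demo::Handler afresh with the given directories searched first.
sub load_with {
    my @path = @_;
    local @INC = (@path, @INC);
    delete $INC{'Demo/Handler.pm'};    # forget any earlier load
    require Demo::Handler;
    return Demo::Handler->api;
}

my $seen_old = load_with($old_lib);
my $seen_new = load_with($new_lib, $old_lib);

print "plain \@INC:          $seen_old\n";
print "Apache2/ put first:  $seen_new\n";
```

Nothing in the name Demo::Handler tells perldoc, an indexer or a 
packaging tool which of the two files is meant. Only the process that 
rearranged @INC knows.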

I wrote Class::Autouse like this. Under the covers it's pretty evil, and 
there are 15 lines of code to do the evil, and 350 lines of code to make 
it behave completely normally in every other regard and behave EXACTLY 
like normal perl does.

The only.pm module, which DOES allow for loading of a specific version, 
is doing so from an API security perspective. It is primarily intended 
for ensuring things like:

"I don't trust that this API won't change, so stick within this range of 
versions".

If it is being abused for different purposes, then that is not really 
its original intent (I hope) :)

---------------------------------------------------

In general, if you change the way an API works, you cause damage. We as 
module authors build in compatibility wherever possible and try to 
maintain compatibility wherever we can to mitigate this, or if we are 
lucky hardly anyone is using the API and the damage is minor.

But regardless we are still causing damage and additional work for 
others for the changes in the API that we want.

----------------------------------------------------

Changing to an entirely different API is a problem. And what you have, 
in essence, with mod_perl 2.0 is an entirely new package/product. An 
entirely new API, but also "branded" as Apache mod_perl. Its name, like 
for Apache itself, is largely a marketing thing. Under the covers, 
everything is new and different.

You are right that the distro name itself is totally irrelevant, and you 
can call the mod_perl 2.0 distro 'boom-badda-badda-bing-5.12' for all it 
matters. Call it anything you like, as long as each file is unique.

PAUSE and CPAN are quite accommodating in that regard and the distro 
naming is mainly just a convention.

The mod_perl 2 API is a totally different beast to the mod_perl 1 API. 
They just happen to do something similar and be "branded" under the same 
name.

What p5p and Randal and others have been trying to convey is that you 
will be unable to avoid causing MASSIVE damage by ripping out an API 
from the namespace that it has existed at for many many years and 
replacing it with a different API, even if you are using the same 
marketing brand for the new API.

What it is called, or what it actually does, or how PAUSE works, or how 
big or small the new and old API is, or the fact it is being done by The 
Apache Foundation are all irrelevant.

In an environment where there is an enforced and unchangeable one-to-one 
relationship between module name and API there is simply no way around 
this without changing the fundamental assumptions of the language.

And this means a fork.

I note that you are actually starting to do something that looks 
suspiciously like a fork already.

You are accumulating parallel mod_perl 2 versions of various standard 
utilities.

You have your own new mp2 version of perldoc, your own mp2 version of 
the cpan client on the way, and an entirely new way of loading perl 
modules. You have a different set of install instructions from the 
norm. You appear to have your own module builder and installer, and I 
think, nay, I know these workarounds will just keep increasing and 
increasing in number.

I fear in your search for workarounds to accommodate the attempt to 
replace the API without actually replacing it, you will have to fork the 
language and provide your own version of perl, and your own version of 
CPAN and rewrite the entire tool chain.

Nobody that I can recall has EVER tried to replace an API so large and 
simply rip out the old one with no access to the previous version and 
no chance of them co-existing on the same system, although your module 
loading workarounds do appear to be sufficient to make the thing actually 
load and run.

Generational API changes will always have difficulty fitting into a 
stable underlying platform that cannot support parallel APIs within the 
same namespace, which is the reason why people do things like SQLite2 or 
imlib2 or what have you. Because it is the only practical way without 
changing the entire environment and everything else in it.

When someone earlier referred to the "stability" of mod_perl, what they 
meant was API stability. That when you call Foo::Bar->baz, the method 
will actually exist and do the same thing that it did when you wrote it.

It might be a little uncomfortable, but I fear you may have no other 
choice than to use Apache2:: or some other namespace.

You talked a little earlier about just not indexing it and letting 
people install it manually. For how long? I find as a general rule for 
every year that an API has been in existence you need to support it for 
another year.

For mod_perl, I think a decade is probably a reasonable time frame in 
which to discontinue its use entirely. It's about the standard time 
period that governments insist on support for, so it's probably a fairly 
reasonable length of time. There are billions of dollars invested in 
large mod_perl based systems, and these need to be supported, if even 
only with security patches, for at least a decade.

If you move mainline to mod_perl 2, and in 5 years a critical security 
bug is uncovered, what is the German Tax Office (to pick someone at 
random, about whom I know nothing) to do? Request an emergency $20 
million amount to port their tax system to mod_perl2?

Not everyone cares about the new speed.

Make it work. Then make it maintainable. Then make it fast.

-----------------------------------------------

Regardless, because of all of this I fear you are eventually going to 
have to capitulate to what I'm sure others have suggested and move to 
using Apache2 or ModPerl or some other namespace other than Apache:: for 
your new API.

I'm certainly not the one to insist you do it. I'm currently on a grant 
to complete something that has been considered impossible until now, so 
I'm hardly the person to throw stones.

You are the one that will ultimately have to make the decision to move 
to Apache2:: or ModPerl::

But unless the expense of making the change to Apache2:: or some other 
namespace is higher than the expense of forking perl, I really feel that 
ultimately you have no other choice.

The alternative is probably to wait until a generation change in perl 
itself (perl 6). And as almighty as the Apache behemoth is, and The 
Apache Foundation are, you are pushing against a fundamental core 
assumption of the language, and I'm sure The Apache Foundation doesn't 
want to do a hostile takeover of The Perl Foundation and fork an entire 
language.

I'm a big believer in responsibility and in cleaning up your own mess. 
In my company, the style guide states that anyone wishing to change the 
style guide is responsible for ensuring that change is made throughout 
all code, by hand if necessary, so that they fully understand the 
implications of what they may have thought was a simple decision but 
that had cascading and far-reaching consequences.

Perl itself works like this as well. If _you_ want to be the first major 
subsystem ever to leverage multiple APIs into the same module name and 
make it work properly, _you_ are the one that is going to have to do all 
of the work. You have to rewrite PAUSE, patch CPAN.pm and CPANPLUS, work 
out all of the problems and, if needed, fork the language.

-----------------------------------------------

I'm also a believer in not bitching unless you can provide an 
alternative solution, and you DO have a few options here.

Options that don't ultimately involve forking perl.

Here's an outline of how _I_ would solve the problem if I were you.

-----------------------------------------------

When it comes to the core modules, you probably don't have any other 
choice than to put your new API into a new namespace slot. One that is 
different but still keeps the Apache "brand".

Whether this is Apache2:: or some other name is your call, it's your 
baby and you get to name it.

Anyone writing completely new software or that wants the additional 
speed can do what they normally do and port to the new API at the new name.

I noticed you have already implemented a compatibility layer. This is 
great, as it will reduce your burden markedly.

What you can do with this is to use the compatibility layer to implement 
Apache::. I assume you have done this accurately enough for it to be 
functionally identical to the mod_perl 1 API.

If someone wants to move their old application to a host with mod_perl2 
installed, they can run it as normal and it will work unchanged. It 
might be a little slower than it _could_ be if they ported it to the new 
API, but it will still work without having to change anything.
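
A rough sketch of that arrangement, using hypothetical package names 
(Demo::Response for the old API, Demo2::Response for the new one) rather 
than any real mod_perl classes. The old name keeps its old methods and 
simply delegates to the new implementation behind the boundary:

```perl
use strict;
use warnings;

# Hypothetical new-generation API, living in its own namespace.
package Demo2::Response;

sub new {
    my ($class) = @_;
    return bless { headers => {}, body => '' }, $class;
}

sub set_header {
    my ($self, $name, $value) = @_;
    $self->{headers}{$name} = $value;
    return $self;
}

sub append_body {
    my ($self, $text) = @_;
    $self->{body} .= $text;
    return $self;
}

sub header { my ($self, $name) = @_; return $self->{headers}{$name} }
sub body   { my ($self) = @_;        return $self->{body} }

# Hypothetical old API, kept at its old name and built on the new code.
package Demo::Response;

sub new {
    my ($class) = @_;
    return bless { impl => Demo2::Response->new }, $class;
}

# Old-style call, translated into the new header interface.
sub content_type {
    my ($self, $type) = @_;
    $self->{impl}->set_header('Content-Type', $type);
    return;
}

# Old-style call taking a list of chunks, as legacy callers expect.
sub print {
    my ($self, @chunks) = @_;
    $self->{impl}->append_body(join '', @chunks);
    return 1;
}

sub impl { my ($self) = @_; return $self->{impl} }

package main;

# Legacy caller: unchanged code, old name, old methods.
my $r = Demo::Response->new;
$r->content_type('text/plain');
$r->print('hello', ' ', 'world');
```

The legacy caller pays for one extra method call per operation, which is 
the "little slower than it could be" cost, and nothing else changes for it.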

As for Apache::* third party module authors, they can implement in two 
different ways.

The purists way would be to port across to the new Apache2:: API.

This gives them the opportunity (in their own time) to change THEIR API 
as well if they wish to take advantage of the new Apache 2.

Under the current model they have to toe the line and port while not 
having the luxury of redesign that you do and having to kludge an 
interface to your new code, or they have to change THEIR API too and 
break all THEIR users' code.

If they wish, they can maintain the old version, provide an adapter to 
the new version, or even do something smarter which would detect the 
omnipresent $ENV{MOD_PERL} variable and split to the old or new code 
accordingly.
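
A minimal sketch of such a split. It assumes $ENV{MOD_PERL} holds a 
banner of the form "mod_perl/<version>" and that the 1.99_xx development 
releases should count as the new generation. The helper name 
mod_perl_generation and the Demo::* backend names are hypothetical:

```perl
use strict;
use warnings;

# Returns 0 outside mod_perl, 1 for the old API, 2 for the new one.
# Assumption: the banner looks like "mod_perl/1.29" or "mod_perl/2.0.1".
sub mod_perl_generation {
    my ($banner) = @_;
    return 0 unless defined $banner;
    return 0 unless $banner =~ m{^mod_perl/(\d+\.\d+)};
    # Assumption: 1.99_xx releases already carry the new API.
    return $1 >= 1.99 ? 2 : 1;
}

# Hypothetical backends a dual-mode module might choose between.
my %backend_for = (
    0 => 'Demo::Backend::Plain',
    1 => 'Demo::Backend::Legacy',
    2 => 'Demo::Backend::Modern',
);

my $generation = mod_perl_generation($ENV{MOD_PERL});
my $backend    = $backend_for{$generation};

print "generation $generation, would load $backend\n";
```

A real module would then require the chosen backend at runtime, so its 
own public name and API stay the same for every caller.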

So things like Apache::MP3 could work on either of the two, and will 
detect when they are loaded. Or they could write a new and improved 
Apache2::MP3. The most important thing is that they can do it in their 
own time, and in their own way.

Let me point out that File::Remove took 6 years before someone picked it 
up again 3 or 4 months ago and started improving it again. But any code 
written 7 years ago STILL WORKS, even though the new code is very very 
different and File::Spec based with support for the local OS "recycle bin".

This is Perl, we have a lot of old code and old authors, and they may 
not play according to your timetable. :)

And in the case of Iain Truskett, it's the one year anniversary of his 
death today. He most certainly will not be able to meet your schedule. 
(and I don't use him lightly, I was the last perl person to see him alive).

And you can't move 14 million lines of code around overnight, not to 
mention the rest of the non-CPAN world.

So anyways, I hope this provides some reference information for you.

Ultimately it's your choice, but I think in its current form it is a 
highly dangerous thing to release.

Adam

P.S. I'm sorry I haven't been able to join in earlier, but when I asked 
about mod_perl 2 I was assured someone else was taking care of it and it 
would all be alright.

I hope I can help you get things sorted now.
