cocoon-dev mailing list archives

From: Stefano Mazzocchi <stef...@apache.org>
Subject: Is Cocoon going to be harmful for XML?
Date: Wed, 10 May 2000 14:46:48 GMT
I know you may be shocked that I would write a title like this, but I
would like to invite you to step back for a second and reconsider the
whole picture as it might look a couple of years from now.

Cocoon was one of the first projects to use XML and XSL on the server
side to deliver web content, but it started simply as a way to learn
XSL without having to touch a shell to restart the processor every
time something changed. That was it, nothing more.

Then more and more of you got interested and Cocoon grew to where it
is now, with great achievements and great plans for its future.

But there is a question that keeps wandering around my head: isn't
Cocoon a hack? And aren't hacks dangerous in the long term?

Perfectly legal HTML markup was broken by Mosaic's support for the
"unclosed" <img> tag. By the time the web architects understood the
problem this created, it was too late to change it back.
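
To make the paradigm mistake concrete, here is the markup in question
(a generic illustration, not taken from any particular page): the
first form is what browsers came to accept, the second is what an XML
parser would require:

    <!-- the "unclosed" empty element browsers learned to accept -->
    <img src="logo.gif" alt="logo">

    <!-- the well-formed, XML-style equivalent -->
    <img src="logo.gif" alt="logo"/>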

Could Cocoon incorporate such a nasty paradigm mistake and cause harm
further down the road?

I believe it could, and I would like to express my vision in order to
find a solution and outline guidelines for being "future-compatible",
so that we avoid ruining the yet-to-come web 2.0 (also known as the
"web of knowledge").

                  ------------------- o ------------------

While server-side transformations allow new formats to be "adapted"
for older clients that do not support them, there is a great risk
that, with this kind of transformation capability available on the
server side, there will be less pressure to create XML-capable
clients.
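
As a minimal sketch of what such server-side adaptation looks like
(the stylesheet and the <page>/<para> element names are hypothetical,
not part of Cocoon itself), an XSLT transformation can turn an XML
source into plain HTML for clients that understand nothing newer:

    <?xml version="1.0"?>
    <!-- hypothetical stylesheet: adapts a simple <page> document to HTML -->
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="html"/>

      <!-- map the XML source onto markup any old client can render -->
      <xsl:template match="page">
        <html>
          <head><title><xsl:value-of select="title"/></title></head>
          <body><xsl:apply-templates select="para"/></body>
        </html>
      </xsl:template>

      <xsl:template match="para">
        <p><xsl:apply-templates/></p>
      </xsl:template>

    </xsl:stylesheet>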

Also, it is very likely that in a few years the percentage of HTTP
requests made from embedded devices will skyrocket, especially with
the advent of 3G cell phones (a.k.a. UMTS), which allow per-traffic
fees rather than the per-time fees of current digital wireless
networks.

So, given the incredible number of HTTP-requesting clients and
different software implementations, "server-side content adaptation"
will be _extremely_ important. Cocoon2 will be able to handle the
requirements of any type of device, since new formats will very
likely be XML-based in the future (also because XML allows better
compressibility of streams, given the high contextual information
available to entropy-estimating compressors; see XMill).

Also, the web has moved from a network of homepages to clustered
information systems. This trend will continue even further in the
future. Why? Because only high traffic can sustain the micropayment
process that drives web sites with banners and advertisements.

The original idea of the web was to place content close to the content
owner, allowing him to write it and maintain it, thus removing the
scalability saturation of centralized information systems.

The Cocoon project (and my college thesis) proposes an alternative
way of removing scalability saturation by using the concept of
"separation of concerns", thus allowing both fine-grained, highly
distributed information systems and centralized ones to succeed in
terms of scalability.

On the other hand, there is a big danger in this: scalability saturation
kept the web distributed, thus allowing it to be more free and more
liberal for both social and economic reasons.

What happens if this limitation is removed, so that sites can grow
linearly with the resources involved but collect revenue at a faster
rate than that?

Would the web become another oligarchy, just like broadcasting channels
(Radio/TV)?

True, you say, there is a significant difference: the energy required
to set up your own TV or radio broadcasting station is incredibly
higher than that required to start your own web site. And this
freedom will always allow people from all over the world to start
their own business or publish their political manifestos, their son's
pictures, their home videos.

Is there a way the web oligarchy can play unfair against this freedom?

Well, they can try legally, but there is no world government and it
is very unlikely there will be one in the near future, so enforcing
such a law would be impossible; it would rather depress the economy
of the strict while boosting the economy of the more liberal.

They could try technically, but TCP/IP was designed to route around such
things and HTTP has no knowledge of restrictions.

They could try to ruin the protocols, but open source projects can't
easily be stopped or shut down (there is a chance of being
discredited, say, by someone hacking the site and adding a backdoor
to the compiler that builds the Apache distribution or the BIND
distribution; this is why the ASF is so paranoid about security, and
why this intrusion created lots of fear in us).

Or, more probably, they could turn them into pariahs and create two
networks: the network of the rich and the network of the poor.

                  ------------------- o ------------------

Is XML going to be the network of the rich? No, because both the
knowledge and the software to create it are as free as air, but
problems start to arise if big sites begin to exchange information
using proprietary DTDs, to use RDF-based search engines, and so
forth.

They create an XML infrastructure so complex and proprietary that
nobody else can cope with it.

Then, while some people can run fine-tuned searches on this awesome
XML/RDF-based search engine, everyone else is left struggling in the
tons of data-garbage that come out of Altavista.

Is this science fiction or something that could happen? Answer for
yourself.

Can Cocoon help to avoid something like this? My answer is: yes,
indeed.

Can Cocoon help to generate something like this? I believe:
unfortunately, yes.

Why?

Well, going back to technical details: the Cocoon sitemap maps
resources to the pipelines that generate them. There is substantial
symmetry between all the types of content those pipelines can
produce. This means that it makes no difference to Cocoon whether you
are sending "RDF-poor markup" or "RDF-rich markup"... which in turn
means that site crawlers have no standard way to obtain information
about your site, about your pages, or, in general, about the
resources that a Cocoon-powered web site contains.
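
To illustrate that symmetry (the fragment below is only a sketch of
the sitemap idea, with made-up patterns and file names, not a
commitment to the final Cocoon2 syntax): a URI pattern is matched, a
pipeline produces the resource, and nothing in the mapping says
whether the result carries rich metadata or none at all:

    <!-- hypothetical sitemap fragment: maps a URI pattern to a pipeline -->
    <map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
      <map:pipelines>
        <map:pipeline>
          <map:match pattern="docs/*.html">
            <map:generate src="docs/{1}.xml"/>   <!-- parse the XML source -->
            <map:transform src="doc2html.xsl"/>  <!-- adapt it for the client -->
            <map:serialize type="html"/>         <!-- stream it out -->
          </map:match>
        </map:pipeline>
      </map:pipelines>
    </map:sitemap>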

This is, IMO, the key point of failure: while Cocoon enforces the use
of XML locally, it neither enforces nor suggests the use of XML
globally.

I'm not sure how this can be done and I'm not sure if this can be done
at all!

But I think that while Cocoon is very well designed around XML,
namespaces and XSLT, RDF remains on the side, treated like any other
markup.

I think we should try to bring RDF into the picture so that RDF-based
crawlers will have a reason to exist once enough Cocoon-published
content is available.
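
One possible shape for this (the URIs and property choices below are
purely illustrative, not a proposal for an actual schema): the same
mapping that serves a page could also serve a small RDF description
of it, giving an RDF-aware crawler something standard to fetch:

    <?xml version="1.0"?>
    <!-- hypothetical RDF description served alongside a Cocoon-generated page -->
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.org/docs/sitemap.html">
        <dc:title>A Cocoon-generated page</dc:title>
        <dc:creator>The site's author</dc:creator>
        <dc:date>2000-05-10</dc:date>
      </rdf:Description>
    </rdf:RDF>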

It might not happen soon, but I don't want to feel guilty ten years
from now if the web has created isles of knowledge and left
individuals out because we did not provide them with the tools to
survive the technical challenge.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------

