xml-general mailing list archives

From Mike Dierken <m...@DataChannel.com>
Subject RE: The need for a site-wide XML-based information system
Date Sun, 21 Nov 1999 20:55:50 GMT
> Information systems pose the real scalability problem in current open
> source projects. Document writers should have the lowest possible energy
> gap, so they can be at full speed in minutes rather than days. Also, the
> project must be able to scale as more people work on the documentation.
>
> Also, good integration with services (webCVS, bugtracking, todolists)
> is of incredible importance.

Sounds good. It also sounds like an 'information portal' and an 'application
portal'. (Sorry for the buzzphrases.)
An 'information portal' is an organized hierarchy of readable/writable
content with personalized presentation, and an 'application portal' is a
secure, single sign-on, manageable type of system. In the end they are the
same, but when you have 'legacy' Web-based apps, integrating them without
rewriting them poses its own set of problems.

Let's toss up some requirements and create a straw-man proposal of the system
diagram (like a WebDAV server, with a security proxy in front of everything
and XSL in there somewhere...). How many hits is xml.apache.org getting? How
about www.apache.org?


>  <p>Here is the <link xlink:href="manual.xml"
>                       xlink:type="simple"
>                       xlink:mode="soft">user manual</link></p>
>
> where the "xlink:mode" attribute was invented here to allow XLink
> interpreters to distinguish between hard links (where href should not be
> touched) and soft links (where href can be considered not as a URI but
> as a key into a URI map).

I like this 'hard/soft' mode on an XML tagged link. After reading this I
translated it into 'presentation' links and 'data' links. Data links are easy,
because they are mixed in with other data. Presentation links are difficult
because the data will never know what presentation format it might end up
in, so instructing the processor to blindly jam in stuff can get messy.
That's one reason why HTML was so quickly adopted: you could do 'includes'
without worrying about presentation because there was only one presentation
format, HTML. The 'right' approach of separating data from presentation
didn't work well and the 'wrong' approach worked because it was a simplified
system.
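
To make the soft-link idea quoted above concrete, here is a rough Python
sketch of a processor that rewires soft links through a URI map. It is only
an illustration under assumptions, not anything Cocoon or Stylebook actually
does: the function name rewire_soft_links and the site_map dictionary are
made up, "xlink:mode" is the attribute proposed in the quoted text rather
than part of XLink, and I am assuming the documents bind the xlink prefix to
the http://www.w3.org/1999/xlink namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: documents bind the "xlink" prefix to this namespace URI.
XLINK = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK}}}href"
# "mode" is the attribute proposed in the quoted message, not part of XLink.
MODE = f"{{{XLINK}}}mode"


def rewire_soft_links(xml_text, uri_map):
    """Treat href on soft links as a key into uri_map; leave hard links alone."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.get(MODE) == "soft":
            key = elem.get(HREF)
            if key in uri_map:
                elem.set(HREF, uri_map[key])
    return root


doc = """<p xmlns:xlink="http://www.w3.org/1999/xlink">Here is the
<link xlink:href="manual.xml" xlink:type="simple" xlink:mode="soft">user manual</link>
and a <link xlink:href="http://www.w3.org/" xlink:type="simple">hard link</link>.</p>"""

# Hypothetical site map: source document name -> name it is served under.
site_map = {"manual.xml": "manual.html"}

root = rewire_soft_links(doc, site_map)
links = root.findall("link")
for link in links:
    print(link.text, "->", link.get(HREF))
```

Links without xlink:mode="soft" pass through untouched, which matches the
"href should not be touched" rule for hard links.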


Mike


> -----Original Message-----
> From: Stefano Mazzocchi [mailto:stefano@apache.org]
> Sent: Saturday, November 20, 1999 4:40 PM
> To: Apache XML
> Cc: Jon Stevens; James Davidson; Brian Behlendorf
> Subject: The need for a site-wide XML-based information system
> 
> 
> Hi,
> 
> I'm currently in the dirty process of moving Cocoon from java.apache to
> xml.apache. Since both projects use the same approach for code
> revisioning (CVS), that transition was not painful.
> 
> On the other hand, the documentation patterns that java.apache adopts
> site-wide (which I personally designed) are based on simple yet very
> effective contracts that helped reduce the overhead of site management
> without hurting the scalability of the project as a whole.
> 
> I would like to show you how this works today in java.apache, and I would
> also like to show you the reasons that brought me to write Cocoon in the
> first place, because, yes, Cocoon was designed as a management tool for
> the java.apache web site and grew up to become a collection of new
> publishing patterns and ideas. Still, most of my experience comes from
> handling such a centralized documentation system with distributed
> authoring.
> 
> java.apache.org
> ---------------
> 
> The java.apache.org web site is composed of
> 
> 1) a graphical and architectural framework of HTML documents, and
> building scripts (in their own CVS module)
> 
> 2) the HTML documents contained in the /docs directory of every hosted
> CVS module
> 
> The idea is rather simple: each project works independently on its own
> documents using HTML and whatever style it likes (a very simple look
> and feel was designed as a guideline but was not mandated). These
> documents are the same ones distributed with the software.
> 
> When required, the scripts are manually executed on the hosting machine.
> They run a series of "cvs update" commands on the site and recreate the
> directory structure, which looks something like this:
> 
> /                <--- index.html and style frames
> /main            <--- site's own documents (like news, TOC, etc.)
> /images          <--- site-wide images
> /<project>       <--- each project has its own directory
> /<project>/dist  <--- each project has its own distribution section
> 
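> The scripts update the site as follows (shell sketch):
> 
>  cvs update site
>  # PROJECTS: assumed list of hosted project names
>  for project in $PROJECTS; do
>   # subshell, so the cd does not carry over to the next project
>   (cd "$project" && cvs update "$project/docs")
>  done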
> 
> problems with this approach
> ---------------------------
> 
> The first problems were due to my own esthetic needs: I came to know the
> web from a graphic designer's perspective, and I think a web site (like any
> other GUI) must be appealing to be functional but should also be
> carefully tuned for speed and usability.
> 
> A lot of effort was put into making java.apache appealing, usable,
> easy to manage and capable of scaling. For this reason, the use of
> frames allowed us to reduce style and linking contracts between pages
> and HTML authors without requiring special template systems (like
> Jakarta does, for example).
> 
> On the other hand, the use of a common look and feel required too much
> graphical knowledge and too much overhead to be auto-maintainable in the
> long term. So, everyone used very basic HTML tags that were codified as
> the look and feel, almost as a style-pruned XML-ish HTML.
> 
> This system was designed 11 months ago, and right after finishing it I
> started to write Cocoon as a way to move out of these style problems.
> Pierpaolo's work on Stylebook came right after these conclusions, which
> we shared during the first months of Cocoon's very basic operation. It
> aimed at batch site generation, while I was more tempted by XML-based
> live web applications.
> 
> Moving to XML
> -------------
> 
> Once you get the components done, doing an XML paradigm shift should be
> a piece of cake. Wrong.
> 
> You know that XML is nothing without a DTD, and a DTD is useless without
> an application that "recognizes it" or a transformation sheet that
> changes it into something that an application understands.
> 
> So, in order to XML-ize project docs, you need a DTD, ideally a
> site-wide DTD, so that stylesheets can be reused between projects and all
> docs come to have the same look and feel. Currently, the xml.apache.org
> web site is created using DTDs that Pier defined, which are very basic but
> fairly complete in a software documentation sense.
> 
> As you can see from xml.apache.org, the results can be rather
> impressive, yet simple and straightforward to maintain for non-graphical
> people and content owners.
> 
> The need for site-wide DTDs
> ---------------------------
> 
> DTDs should not change frequently, and backward-incompatible changes
> should be reduced to zero. Still, there must be a place where DTD changes
> are discussed, voted on and approved. The adoption of particular DTDs for
> documentation writing must also be imposed at a site-wide level.
> 
> The problem of hard hyperlinks
> ------------------------------
> 
> Hyperlinks pose a significant problem. Suppose you have an XML fragment
> like this
> 
>   <p>Here is the <link href="manual.xml">user manual</link></p>
> 
> it clearly indicates that "user manual" should be connected to the
> "manual.xml" file. On the other hand, the manual.xml file could have
> been transformed into "manual.html" to be served as "text/html" by the
> web server.
> 
> It is clear that the use of HTML-style hyperlinking is not enough to
> handle XML documentation. Unfortunately, the XLink spec is not powerful
> enough to handle even these cases, since all hyperlinks are considered
> "hard" and immutable. Here is how I would do it:
> 
>  <p>Here is the <link xlink:href="manual.xml"
>                       xlink:type="simple"
>                       xlink:mode="soft">user manual</link></p>
>
> where the "xlink:mode" attribute was invented here to allow XLink
> interpreters to distinguish between hard links (where href should not be
> touched) and soft links (where href can be considered not as a URI but
> as a key into a URI map).
> 
> This soft mode will allow document processors to "rewire" the documents
> based on some site map that is also used to create the documents. Of
> course, this attribute does not make sense on the client side, since all
> client-side-interpreted links must be considered "hard".
> 
> Moving Javadoc into XML
> -----------------------
> 
> The Cocoon project is currently working on writing a JavaDOC DTD and
> implementing an XML doclet to allow javadocs to be generated using XML
> and without containing any style information.
> 
> This will allow inlined-code documentation to be XML and XSL processed
> for styling, filtering, transformation or even more complex operations.
> The plan is also to allow the inclusion of syntax highlighted source
> code inside the javadocs to create a sort of "annotated code" with
> highly visual appeal.
> 
> Also, it would be interesting to estimate the use of javadoc->XMI
> transformations for direct UML-like diagrams, but this is not in our
> plans at this point.
> 
> Conclusions
> -----------
> 
> Information systems pose the real scalability problem in current open
> source projects. Document writers should have the lowest possible energy
> gap, so they can be at full speed in minutes rather than days. Also, the
> project must be able to scale as more people work on the documentation.
>
> Also, good integration with services (webCVS, bugtracking, todolists)
> is of incredible importance.
> 
> While other projects are creating the machinery and the needed web
> applications (Brian's Tigris, Jon's Jyve), I would like to see this site
> showing the power of XML for scalable web-based information systems.
> 
> Along, of course, with the plan of integrating Stylebook batch
> functionality with the next generation of Cocoon.
> 
> So, please, let's discuss the items here so that we can start creating
> and proposing those patterns that allow projects to operate coherently
> with one another.
> 
> Sorry for the long letter, but this is a really important aspect of our
> work and, IMO, deserves such a long note.
> 
> -- 
> Stefano Mazzocchi      One must still have chaos in oneself to be
>                           able to give birth to a dancing star.
> <stefano@apache.org>                             Friedrich Nietzsche
> 
> 
