Return-Path:
Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Delivered-To: mailing list cocoon-dev@xml.apache.org
Received: (qmail 36996 invoked from network); 20 Jun 2000 13:55:52 -0000
Received: from systemy.systemy.it (194.20.140.20) by locus.apache.org with SMTP; 20 Jun 2000 13:55:52 -0000
Received: from apache.org (pv50-pri.systemy.it [194.21.255.50]) by systemy.systemy.it (8.8.8/8.8.8) with ESMTP id NAA09436 for ; Tue, 20 Jun 2000 13:55:43 GMT
Message-ID: <394F7705.797E90CB@apache.org>
Date: Tue, 20 Jun 2000 15:52:05 +0200
From: Stefano Mazzocchi
Organization: Apache Software Foundation
X-Mailer: Mozilla 4.72 [en] (Windows NT 5.0; I)
X-Accept-Language: en,it
MIME-Version: 1.0
To: Cocoon
Subject: [C2] Sitemap revised again
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Rating: locus.apache.org 1.6.2 0/1000/N

You people are going to hate me for this, but I think I have solved the problems in the current sitemap, and after careful thinking there are a few things to change again.

1) resource loading model

We want sitemaps to be cascaded. This is a fundamental feature for site management scalability. We also know documents and sitemaps can be stored in very different locations, ranging from file systems, web servers and FTP servers to compressed archives, CVS servers and XML databases.

Careful: resource loading is _not_ the equivalent of "Generator" modularity. A resource is a stream of chars that happens to be well-formed XML. A generator is an adaptor between something and a stream of SAX events. A "parser" is a specific generator that does XML parsing, but by using the resource loading abstraction we are able to use the exact same code to load a document from the file system, a remote (dynamic) URI or even a CVS server.

Note: in some cases, the generators might be able to skip the parsing stage, thus requiring special generation logic to hook into SAX-aware output events from XML storage repositories...
for example, Prowler might be able to generate SAX events directly in response to an XPath query, without the need for XML serialization and parsing.

Anyway, today we follow the namespace pattern

indicates that the "jar" protocol should be used to get the resource.

Now, I propose to use the java.net.URL method to do this. Why? Mainly to allow the resource loading abstraction to be independent of the attribute/element schema, for example allowing some RDF-equivalent syntax such as

which allows the resource loading abstraction to be used for components as well, such as

2) RDF model for attributes

the element has a special meaning, just like some RDF elements.

will be equivalent to

this allows you to use whatever verbosity you like.

[XXX: should we use RDF directly?]

3) Matchers and Choosers

Ok, this is hard and tricky, so stick with me and don't lose yourself in the declarative forest.

We agreed the sitemap needs a complete boolean conditional model. This was identified into following XSLT's.

The natural problem is: what do we put in the "test" attribute? XSLT places XPath there. Always and only XPath.

Note: XPath is not extensible, only XSLT is. (In fact, both XQL and XPointer can be seen as XPath extensions, but each extension requires a new specification, since XPath doesn't describe an extensible model.)

The use of a special "XPath"-equivalent for sitemaps was proposed by Donald. While I'm not against it in principle, I find it too weak for the planned needs. We both failed to see that the sitemap model is already powerful enough to make both sides happy:

...

or

while "chooser" pluggability allows you to do whatever you want with the "test" attribute.

So, the interface for Chooser becomes

  public interface Chooser extends Component {
    public boolean evaluate(String test, ...);
  }

where "..." identifies all the objects the chooser will need to evaluate the choice.

Ok, you say, but this is going to be slow!
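To see where that cost comes from, here is a minimal sketch of one possible Chooser. It is only an illustration: the `Component` marker is a stand-in, the elided "..." arguments are replaced by a hypothetical `Map` of request parameters, and the `name=value` test syntax and the `ParameterChooser` class are assumptions, not part of the proposal. Note that the "test" string is re-parsed on every call.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the framework's Component marker interface (assumed here).
interface Component {}

// The Chooser contract from the text; the elided "..." arguments are
// represented by a hypothetical Map of request parameters.
interface Chooser extends Component {
    boolean evaluate(String test, Map<String, String> parameters);
}

// Hypothetical chooser: the test string is either "name=value" (true when
// the named parameter has exactly that value) or "name" (true when the
// parameter is present at all).
class ParameterChooser implements Chooser {
    @Override
    public boolean evaluate(String test, Map<String, String> parameters) {
        // The test string is parsed again on every single evaluation.
        int eq = test.indexOf('=');
        if (eq < 0) {
            return parameters.containsKey(test.trim());
        }
        String name = test.substring(0, eq).trim();
        String expected = test.substring(eq + 1).trim();
        return expected.equals(parameters.get(name));
    }
}

class ChooserSketch {
    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<>();
        parameters.put("lang", "it");

        Chooser chooser = new ParameterChooser();
        System.out.println(chooser.evaluate("lang=it", parameters));
        System.out.println(chooser.evaluate("lang=en", parameters));
        System.out.println(chooser.evaluate("debug", parameters));
    }
}
```

For a test syntax this small the parsing cost is negligible; it only starts to matter when the "test" language grows toward something as rich as an XPath alternative.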
Right, so here we keep going

  public interface CompiledChooser extends Component {
    public boolean evaluate(...);
  }

then

  public interface ChooserFactory {
    public String generateCode(String test);
  }

which allows us to compile choosers into classes that are indexed by the hash of the "test" string and executed to avoid runtime parsing of the "test" string. (Of course, this is required only for very complex operations like Donald's XPath alternative.)

So much for the "Chooser" part.

There were naming discussions between "Choosers" and "Matchers"... I think there is no need for this: they are _different_ things. Different models.

Let's see why: in the original sitemap proposal Pier and I wrote we had

this follows a declarative matching model, just like xsl:template does. But it adds variable percolation of the URI tokens generated by the wildcard pattern. This has no equivalent in XSLT. (xsl:value-of is similar but not equal to this, and much more verbose and general.)

It was suggested that using URI-based declaration may be limiting. At the same time, a better conditional model was asked for. I merged the two since it seemed to be the right thing to do.

The problem is that choosers respond with booleans, while matchers respond with maps.

Moreover, the nice thing about the XSLT declarative model (which allows very nice work parallelization even inside the same sitemap) was removed in favor of a more procedural view of nested elements.

So, I propose to introduce -both- Choosers and Matchers, the first responsible for deciding at runtime whether a condition is satisfied, the second for determining whether the current status "matches" a given pattern and, if so, using the pattern to percolate information thru the pipeline.

[NOTE: when the "type" of the component is not specified, a default component will be used; the section will allow each category to define its default value, which can be omitted to reduce verbosity]

but also allows weird things like

...
(not that I suggest doing this, but it proves the concept)

Unlike xsl:templates, matchers can be nested

...

the Matcher interface is similar to the Chooser one, but different

  public interface Matcher extends Component {
    public Map match(String pattern, ...);
  }

and can follow the same compilable model as Choosers

  public interface CompiledMatcher extends Component {
    public Map match(...);
  }

and

  public interface MatcherFactory {
    public String generateCode(String pattern);
  }

The two models, just as in XSLT, give you enough flexibility to perform whatever conditional sequence you need, give you complete programmability thru the use of pluggable conditional components, and don't pose a severe performance limitation, given the ability to compile the individual matching/choosing patterns/tests.

I believe this solves all the problems encountered so far and reuses many of the good patterns that XSLT proposed, while removing the limitations and rough edges that XSLT has in some areas.

While incredibly flexible, I don't think this proposal is based on more flexibility than it requires... but keep in mind that this, even if finalized, will be sitemap version 1.0 and there will be other versions in the future.

Anyway, from what I'm able to see now, I think this thing rocks the party.

Let me know your comments

--
Stefano Mazzocchi
One must still have chaos in oneself to be able to give birth to a dancing star.
Friedrich Nietzsche
--------------------------------------------------------------------
Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------
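As an illustration of the Matcher contract described above, here is a minimal sketch of a wildcard URI matcher that returns a map of tokens, or null when the pattern does not match. Everything beyond the `match(String pattern, ...)` shape is an assumption: the elided "..." is replaced by a single URI string, the `Component` marker is a stand-in, the `WildcardUriMatcher` class is hypothetical, and the token keys "1", "2", ... are an arbitrary naming choice. The wildcard is translated to a regular expression here, which is a swapped-in technique and not the code-generation model of MatcherFactory.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Stand-in for the framework's Component marker interface (assumed here).
interface Component {}

// The Matcher contract from the text; the elided "..." arguments are
// represented by a single hypothetical URI string.
interface Matcher extends Component {
    Map<String, String> match(String pattern, String uri);
}

// Hypothetical matcher: each '*' in the pattern captures one token,
// every other character must match literally.
class WildcardUriMatcher implements Matcher {
    @Override
    public Map<String, String> match(String pattern, String uri) {
        // Translate the wildcard pattern into a regular expression.
        StringBuilder regex = new StringBuilder();
        int start = 0;
        int star;
        while ((star = pattern.indexOf('*', start)) >= 0) {
            regex.append(Pattern.quote(pattern.substring(start, star)));
            regex.append("(.*?)");
            start = star + 1;
        }
        regex.append(Pattern.quote(pattern.substring(start)));

        java.util.regex.Matcher m =
            Pattern.compile(regex.toString()).matcher(uri);
        if (!m.matches()) {
            return null; // no match: nothing to percolate
        }

        // Expose the captured tokens under the keys "1", "2", ...
        Map<String, String> tokens = new LinkedHashMap<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            tokens.put(String.valueOf(i), m.group(i));
        }
        return tokens;
    }
}

class MatcherSketch {
    public static void main(String[] args) {
        Matcher matcher = new WildcardUriMatcher();
        System.out.println(matcher.match("docs/*/*.html", "docs/guide/intro.html"));
        System.out.println(matcher.match("docs/*.html", "images/logo.png"));
    }
}
```

The difference from a Chooser shows in the return type: a boolean can only select a branch, while the map carries the matched URI tokens forward for later pipeline steps to use.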