cocoon-users mailing list archives

From Mark Washeim <esa...@canuck.com>
Subject Re: XML Query language & Site Skins
Date Sun, 20 Feb 2000 20:44:44 GMT
on 18/2/2000 13.05, Stefano Mazzocchi at stefano@apache.org wrote:

> Mark Washeim wrote:
>> 
>> Ok. I think I've resolved the problem of using the sitemap, as I saw it
>> being a problem.
>> 
>> To be to the point:
>> Problem 1. The site map was not a good means whereby authors in a shared
>> namespace could use each other's subset of documents
>> Problem 2. The site map 'appeared' to bind THE URI of a document to
>> 'several' views, with no means of determining which view is THE model.
>> 
>> This problem is ameliorated (that's just to trip up the non-English speakers
>> for fun), I mean eased, if I proceed as follows (I'm going to use a
>> Localization example because of Ulrich's earlier mention).
>> 
>> You'll note that nothing I'm saying is in conflict with the site-map but
>> extends it (or supplements it, not sure which yet).
>> 
>> DTDs will REQUIRE schemas. Schemas MAY be authoritative only within a target
>> namespace (as per the recommendation). That will permit namespace-delimited
>> URIs. Schemas will be extended (derived) across namespaces.
>> 
>> Ok, you can guess everything from a familiarity (I'm assuming it :) ) with
>> the spec.
>> 
>> The important thing is that document authors will NOT be working against the
>> site-map (nor will other content managers), but against the aggregate
>> (composite, whatever) schema. That's perfectly congruent with Cocoon's
>> contracts.
>> 
>> Since the author and manager views will NEVER be the 'public' views, the
>> content model will be what they work against, not views exposed through the
>> site map.
>> 
>> It will facilitate addressing multiple doctypes across domains with an
>> authoritative schema delimiting the namespaces and permitting
>> permutations of document objects.
>> 
>> It WILL also, however, require the public views (the more mixed up HTML
>> created at the direction of marketing departments) to be produced using
>> 'producers' (generators?) that also work against the schema, but are NOT
>> authoritative. That is, they do not have a direct relationship to the
>> content model namespaces.
>> 
>> Since the site-map does not deal with namespaces but treats everything as
>> if it lived in a virtual file system, it's not a problem.
>> 
>> The site-map, for every public part of the document namespace, should have
>> ONE authoritative (a la Lee) URI (in this case it will be a producer or chain
>> of producers, whatever) to a view and then whatever number of additional
>> ones that may be required.
>> 
>> The content managers won't be exposed to the content models in the same way
>> that everyone else is, however (ie, cocoon). They'll be working against the
>> schema (perhaps the default producer in the site-map????).
>> 
>> We (Large Medium) will create some mechanisms for creating producers from a
>> set of schemas. Perhaps in an automated way (as per the xsp compilation
>> mechanism??? can't remember if xsp pages are 'producers')
>> 
>> Stefano, remember I kept referring to a 'catalog' of entities to derive
>> views. What I meant but didn't explain clearly was that I wanted 'pure'
>> views, editors' views of the content model. As you would obtain by having the
>> schema in front of you. Of course, we'll provide the 'affordances' of the
>> gui.
>> 
>> The following is an example in terms of content management/editing (schemas)
>> and cocoon site-map (derived from w3c examples):
>> 
>> -- Quote --
>> <type name="WorldAddress"
>> source="po:Address" derivedBy="extension">
>> <element name="country" type="string"/>
>> </type>
>> 
>> <type name="GermanAddress"
>> source="po:WorldAddress" derivedBy="extension">
>> <element name="land" type="string"/>
>> </type>
>> 
>> <element name="person">
>> <type>
>> . . .
>> <element name="address" type="po:Address"/>
>> </type>
>> </element>
>> 
>> <person>
>> ...
>> <address>
>> ...
>> </address>
>> </person>
>> 
>> <person>
>> <address xsi:type="GermanAddress">
>> ...
>> <country>Germany</country>
>> <land>Saarland</land>
>> </address>
>> </person>
>> Two types derived from the Address type defined in Sample Schema
>> (non-normative) (§F) are defined, adding first a country and then a land
>> element to its required content. Two schema-valid instances of an element
>> declared with type Address are shown, one using that type itself, and
>> therefore not requiring disambiguation, and one using the xsi:type attribute
>> to indicate that it is using the GermanAddress type.
>> -- End Quote
>> 
>> <process uri="/HQAddressModel" private="true">
>> <generator type="schema" src="/THESchemaNSURIWorldAddress"/>
>> <filter>
>> <parameter name="stylesheet" value="defaultEditor.xsl"/>
>> </filter>
>> <serializer type="HTML"/>
>> </process>
>> 
>> <process uri="/HQAddress">
>> <generator type="schema" src="THESchemaNSURIWorldAddress"/>
>> <filter>
>> <parameter name="stylesheet" value="defaultView.xsl"/>
>> </filter>
>> <serializer type="HTML"/>
>> </process>
>> 
>> <process uri="/GermanAddressModel">
>> <generator type="schema" src="THESchemaNSURIGermanAddress"/>
>> <filter>
>> <parameter name="stylesheet" value="defaultEditor.xsl"/>
>> </filter>
>> <serializer type="FO"/>
>> </process>
>> 
>> <process uri="/GermanAddress">
>> <generator type="schema" src="THESchemaNSURIGermanAddress"/>
>> <filter>
>> <parameter name="stylesheet" value="GermanView.xsl"/>
>> </filter>
>> <serializer type="HTML"/>
>> </process>
>> 
>> Ok, so far the namespaces are all shared. However, I'm not stuck. I can use
>> a different target namespace for a derived schema and keep contracts and
>> models intact (for the managers). For example:
>> 
>> <process uri="/MarksGermanAddress">
>> <generator type="schema" src="THESchemaNSURIMarksAddress"/>
>> <filter>
>> <parameter name="stylesheet" value="MarkView.xsl"/>
>> </filter>
>> <serializer type="mathML"/> (he, he, he)
>> </process>
>> 
>> Hmmm. This still needs some work, but I think the intention is clear.
> 
> Yes, I started earlier to see the light in your concerns and now I know
> what you intended.
> 
> I've been referring to this as "schema reaction". Meaning that the
> pipeline is "automatically" composed by "matching" pieces... or, in a
> sense, the pipeline could be "validated":
> 

Yes, yes, yes! This is exactly where I was heading. 'Schema reaction' is
precisely what I was thinking about. It's not just a question of validation.
It's also a question of 'instantiation'. That is, we can only properly
instantiate in view of the schema, not VIEWS and that's been my whole
problem all along. The site-map appeared to limit us to creating
producers/generators which were always views and not THE model. Ok, so we
just introduce producers which are schema based AND, preferably, make the
site-map 'implement' schema namespaces, perhaps within zones (as for jserv).


> - generators create XML, but they _should_ inform Cocoon about what
> namespaces they contain.
>

Yes!
 
> - filters (I'm referring to the stylesheet itself as a filter, not the
> XSLT processor) consume some namespaces and produce some others.
> 
> - serializers consume some namespaces
> 
> Now, think of namespaces as shapes for puzzle pieces: sometimes, you are
> _forced_ to use one piece, or some other times, a piece can connect to
> all others.
> 
> Mark's question can be put in this way: if we have those pieces on the
> table, can cocoon create the puzzle without needing special
> instructions?
> 
> or better
> 
> can cocoon require "less" building instructions and guess some of
> the "shape" of the pieces?
>

Yeah! Now we're surfing. Pardon the hackneyed metaphor :)

 
> Let's analyze this:
> 
> - for puzzles, it's algorithmically certain (even if algorithmically
> complex) to have a turing machine that creates the puzzle, but only if
> there are no pieces that have the same shape (for simplicity, I neglect the
> picture on the pieces, I just consider their shapes). It's a matter of
> finding the right piece but the topological analysis is trivial.
> 
> - in the lego brick case, all pieces have the same connector shape.
> Given a number of pieces, there is no way you can come up with the
> wanted lego model without _detailed_ instructions.
> 
> - for cocoon we are in between: there are some cases where one piece
> forces the next (when one spits a schema and only one other piece is
> able to consume it), but there are cases where a piece can connect with
> all of the others (the xml serializer is able to serialize _any_ xml
> document)
> 
> So, the question becomes: how can we use schema producing/consuming
> information to reduce the size of the sitemap?
> 
> I've been thinking about that since the day I started thinking about the
> sitemap.
> 
> I _can feel_ it's possible... I just can't see how...
> 
> Do you have any idea?
> 
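
The puzzle analogy above can be sketched in a few lines of Python. This is
only an illustration and not Cocoon code: the Component class, the namespace
labels, the ANY wildcard and the helper functions are all hypothetical. Each
piece declares the namespaces it consumes and produces, and a piece that
consumes ANY behaves like the xml serializer that connects to everything.

```python
from dataclasses import dataclass

ANY = "*"  # hypothetical wildcard: the piece accepts any namespace


@dataclass(frozen=True)
class Component:
    name: str
    consumes: frozenset  # namespaces the piece can take in
    produces: frozenset  # namespaces the piece emits (empty for serializers)


def fits(upstream, downstream):
    """True if downstream can consume everything upstream produces."""
    if ANY in downstream.consumes:
        return True
    return bool(upstream.produces) and upstream.produces <= downstream.consumes


def next_pieces(upstream, pieces):
    """All pieces that could follow upstream; a single result means the step is forced."""
    return [p for p in pieces if p is not upstream and fits(upstream, p)]


def validate(pipeline):
    """Check that every adjacent pair in a declared pipeline fits together."""
    return all(fits(a, b) for a, b in zip(pipeline, pipeline[1:]))


# Made-up pieces, for illustration only.
generator = Component("address-generator", frozenset(), frozenset({"po"}))
view = Component("defaultView.xsl", frozenset({"po"}), frozenset({"html"}))
html_out = Component("html-serializer", frozenset({"html"}), frozenset())
xml_out = Component("xml-serializer", frozenset({ANY}), frozenset())

pieces = [generator, view, html_out, xml_out]
candidates = [p.name for p in next_pieces(generator, pieces)]
```

With these made-up pieces, the generator can be followed by either the
stylesheet or the wildcard serializer, so the step is not forced and the
sitemap would still need an explicit instruction there. A declared pipeline
such as generator, view, html_out passes validate(), while generator followed
directly by html_out does not.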


I'm thinking of something as terse as (I'm just going to use paths):
/domain/ 
/domain/implementationOne/

for instance:

/humanresources/
/humanresources/en/
/humanresources/fr/
/humanresources/fr/1999/
/humanresources/fr/1999/datestamp

ONLY the final namespace reference is to an actual document instance. The
rest is all schema namespace.

An interesting thing that comes out of this is that the namespaces, in
addition to providing 'authority' in terms of the data model, also point
toward a 'map' for authorization! Well, ok, I'm pushing out of the context
of the discussion, but I'm seeing that everything until we arrive at a
concrete document instance is obviously 'restricted'. That is, viewing it
will be limited to within the domain of authors, content managers. That's
interesting.
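
A rough sketch of that path reading, in Python. It rests on an assumption
that the message does not state: that a trailing slash marks a schema
namespace and only a path without one names a document instance. The
function names classify and is_public are hypothetical.

```python
def classify(path):
    """Split a path into (prefix, kind) pairs.

    Assumption for this sketch: every prefix ending in a slash is a schema
    namespace, and only a final segment without one is a document instance.
    """
    parts = [p for p in path.split("/") if p]
    ends_in_instance = not path.endswith("/")
    levels = []
    for i in range(1, len(parts) + 1):
        prefix = "/" + "/".join(parts[:i])
        if i == len(parts) and ends_in_instance:
            levels.append((prefix, "document"))
        else:
            levels.append((prefix + "/", "schema"))
    return levels


def is_public(path):
    """Only a concrete document instance is a candidate for public viewing."""
    levels = classify(path)
    return bool(levels) and levels[-1][1] == "document"
```

Under that assumption, /humanresources/fr/1999/datestamp yields three schema
levels followed by one document, and everything above the document stays
restricted to authors and content managers.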

The main point is, the 'puzzle pieces' are schema within schema. As long as
you have a schema which contains the shared elements, a meta-schema, if you
will, there isn't any 'real' problem assembling the puzzle. That is, because
you actually have both the shapes of the pieces defined AND instructions for
putting them together (the schema itself).

Of course, that's BEFORE we get to the point of a serializer or other type
of consumer. I'll take up 'negotiating the schema' in producers, below . . .

>> I've tried to mix up the views and models, but one thing that remains
>> unclear is whether you would actually use cocoon to map the namespaces for
>> content managers? Possibly, you just leave them out because, currently, the
>> post operations aren't dealt with within the framework???
> 
> The real problem is that you have to pre-process the documents to see
> what schema (read namespaces) they contain. And this is not always
> possible.
> 
> yes, we could provide some of these hooks in the
> generator/filter/serializer API that would help to "validate" the pipe
> and do some "sitemap debugging"... it would also help the visual sitemap
> authoring GUIs with puzzle-like shapes that can or cannot
> interconnect...
> 
> Gee, I've been playing around with those ideas for months now... but I
> can't see the light yet...
> 

What I'm imagining for the sitemap is that generators/producers (why
'generator'? producer/consumer is so much clearer; generators make energy,
not 'things' ????) will 'localize' to the namespace. That is, some producers,
notably toward the 'top' of the schema, will produce by implementing schema
derived models; others, lower in the scheme of inheritance/derivation, will
need to do some negotiation vis-a-vis the schema.

Hmm. I'm starting to get fussy.

Ok. I'm not sure that it's the correct context, but, for the sake of
discussion, I'll stick to producers.

A schema based producer would have to implement a 'SchemaNegotiator' or a
'SchemaNavigator'. If we return to the idea of puzzle assembling machines,
it's obvious that what we have here is a question of indicating to the
machine WHICH puzzle to assemble. This puzzle (schema):

/humanresources/fr/1999/

is easy. The machine has ALL the shapes to hand AND the instructions for
composing the puzzle, in total.

This puzzle:

/humanresources/

if we permit derivation by extension and restriction, is theoretically
of 'indefinite' (I think?) complexity. But, in practice, will it be a nicely
finite number? Moreover, we're not necessarily asking the machine to
assemble, only to negotiate or navigate. That is, the machine must ask which
puzzle among puzzles to assemble.

If we consider a shared namespace across many document types, a schema based
producer will have to negotiate:

<type name="WorldAddress" source="po:Address" derivedBy="extension">

or 

<address xsi:type="GermanAddress">

and the complexity will make it difficult for the producer to scale with
schema complexity, but that's not necessarily a short-term problem.

Without being able to suggest a proper solution to this negotiation problem
now, what I can suggest is that for our management/authoring problems, the
schema based producer doesn't have to function like a Turing machine, at
all. It only has to ask 'what puzzle do you wish me to complete' and then
present the choices.

Of course, we have to give the schema based producer some reasonable point
of departure. After all, many roads lead to Rome ! :)
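
The 'ask which puzzle' behaviour might look roughly like the following
Python sketch. Nothing here is an existing API: the SchemaNavigator class,
its methods and the DERIVATIONS table are hypothetical, and the table simply
restates the WorldAddress/GermanAddress derivations from the quoted example.
It assumes the derivation table contains no cycles.

```python
# Hypothetical derivation table: derived type -> (base type, derivation method).
DERIVATIONS = {
    "WorldAddress": ("Address", "extension"),
    "GermanAddress": ("WorldAddress", "extension"),
}


class SchemaNavigator:
    """Presents the available choices instead of trying to assemble anything."""

    def __init__(self, derivations):
        self.derivations = derivations

    def choices(self, start):
        """Return start plus every type derived from it, directly or indirectly."""
        found = [start]
        for base in found:  # the list grows while we walk it (breadth-first)
            for name, (parent, _method) in self.derivations.items():
                if parent == base and name not in found:
                    found.append(name)
        return found

    def lineage(self, type_name):
        """Walk back from a type (e.g. an xsi:type value) to its root type."""
        chain = [type_name]
        while chain[-1] in self.derivations:
            chain.append(self.derivations[chain[-1]][0])
        return chain


navigator = SchemaNavigator(DERIVATIONS)
```

The point of departure is the start argument: asking for choices from
Address offers Address, WorldAddress and GermanAddress, and the editor picks
one rather than the machine guessing.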



> yes, Mark, the sitemap verbosity scares me too, but I think that the
> ways to reduce the sitemap verbosity are _not_ to be considered a design
> flaw of the sitemap namespace or of the model of contracts... rather the
> opposite: layering is the key. Once the sitemap is complete and it's in
> place, we'll deal with its management costs and try to reduce them with
> tools, xml-inheritance, xml-inclusion, xml-self-referencing and provide
> validation points with schema reaction.
> 
> But this, IMO, doesn't break the model, quite the opposite: it shows its
> flexibility.
> 
>> Hmmm. Food for thought.
> 
> yep, hope others follow us.
> 

I don't know if these primitive reflections on the use of the site-map and
schema based production are a useful contribution. I'm entering another
intense production period (ie. site launch in the offing), so I'm only
skimming the surface.

I do agree, as I alluded to earlier, that it's not a problem of the site-map
per se. Just a question of attending to the distinction between:

Model (DTD or Schema) on the one hand, and
Implementation (XML document instance) on the other.????



>> I'm sure it's all old hat to you guys.
> 
> No, really, it helped me to clarify things that were still blurry in my
> head.
> 
>> Please let me know what I've missed or when I've stated the obvious.
> 
> What you've stated is _far_ from obvious, especially since a vision of
> orthogonal namespaces is missing even inside the XML spec. The XSchema
> people are trying to patch the design flaws of the XML 1.0 spec... but
> it will take some time before people start to understand what XML is
> really about: namespaces.
> 

It wasn't obvious to me, initially. I, like everyone else, was busy thinking
about the fact that the data model representation at the level of the
individual model was finally being addressed. I'm propelled to think beyond
that because we already have to consider: many clients, many client/domains.
I guess the pressure to encompass that complexity brings the namespace
issues to the forefront.

The other thing which prompted me was looking at the jserv zone based
preferences model and the use of aliases for servlet initialization.

In our case, we have something very much like the site-map approach in place
for servlets.

We have, in reality, only ONE servlet (LMServlet) which is always initialized
with a set of ACTIONS. These actions implement a wrapper (or decorator)
pattern and hence may be wrapped around one another in an almost arbitrary
order. The long and the short of it is that we minimize the
administration to the central locale of the domain.properties file. It looks
like this in the file:

servlet 1
servlet.someAlias.code=LMServlet
servlet.someAlias.initArgs=ACTIONS=lm.saw.LogAction;lm.saw.SendTemplateAction;lm.client.domain.saw.someBusinessLogicAction;lm.saw.SIDAction,LOGFILE=/some/path/somelog.log,LOGFIELDS=SEC;PG,SOME_TABLE=someTable,ALIAS=someAlias

servlet 2
servlet.someAlias2.code=LMServlet
servlet.someAlias2.initArgs=ACTIONS=lm.saw.LogAction;lm.saw.SendTemplateAction;lm.client.domain.saw.someOtherBusinessLogicAction;lm.saw.SIDAction,LOGFILE=/some/path/somelog2.log,LOGFIELDS=SEC;PG,SOME_TABLE=someTable,ALIAS=someAlias2

Like cocoon (sort of), LMServlet is just a framework and like the sitemap,
we use the domain.properties file to simplify administration of the servlets
in the system. And, as you point out for the site-map in cocoon, this does
scale well.
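
To make the wrapping concrete, here is a small Python sketch of how such an
initArgs value could be read and turned into a chain of wrappers. It is not
the LMServlet code: parse_init_args and wrap are hypothetical names, and the
assumption that the first listed action is the outermost wrapper is mine,
since the message does not state the order.

```python
def parse_init_args(init_args):
    """Split 'KEY=value,KEY=value' pairs; ACTIONS becomes a list split on ';'."""
    settings = {}
    for pair in init_args.split(","):
        key, _, value = pair.partition("=")
        settings[key] = value.split(";") if key == "ACTIONS" else value
    return settings


def wrap(action_names):
    """Compose actions so that the first listed is the outermost wrapper.

    Each layer records its own name and then delegates inward; the returned
    list is a trace of the call order, ending with the request itself.
    """
    def innermost(request):
        return [request]

    handler = innermost
    for name in reversed(action_names):
        def layer(request, name=name, inner=handler):
            return [name] + inner(request)
        handler = layer
    return handler


sample = ("ACTIONS=lm.saw.LogAction;lm.saw.SIDAction,"
          "LOGFILE=/some/path/somelog.log,ALIAS=someAlias")
settings = parse_init_args(sample)
trace = wrap(settings["ACTIONS"])("request")
```

Reordering the names in ACTIONS reorders the wrappers without touching any
code, which is the administrative convenience described above.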

Anyway, the main point is that the process directives in the site-map,
mapping producers, filters and serializers is similar to this, but more
complex. 

Ok, I'm drifting off topic, I think. ....

>> Quem colorem habet sapientia?'
> 
> Hmmmm, blue cocoon? :)

I think blue is definitely the right colour. I may be able to contribute a
graphic designer to do some logo design. Anyone else volunteered yet???

-- 
Mark (Poetaster) Washeim

'On the linen wrappings of certain mummified remains
found near the Etrurian coast are invaluable writings
that await translation.

Quem colorem habet sapientia?'

Evan S. Connell

 


