forrest-dev mailing list archives

From Steven Noels
Subject Re: [VOTE] Usage of file.hint.ext convention
Date Mon, 02 Sep 2002 10:48:22 GMT
Nicola Ken Barozzi wrote:


>> But it requires the docwriter to think about the management concern, IMO.
> They still write mydoc.xml or mydoc.gif, no?
> This isn't a management concern, no?

No, that's the way silly OSes bind editing apps to filetypes. On Unix 
and Mac OS, this has been solved in a more robust way, IMHO 
(/etc/mime-magic and Resource Forks).

>> I'm still thinking about those content-aware pipelines, and for some 
>> app we are developing, we actually have been using this technique 
>> doing a XML Pull Parse on the document to check its root element - 
>> here, we could check for its DTD identifier.
> It's neat, but a PITA for many users.

How come? Using CAPs (content-aware pipelines), the system decides what 
will be done with their XML files, depending on the editing grammar they 
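
A minimal sketch of that root-element / DTD-identifier check, written in Python with the standard expat bindings rather than the pull parser mentioned above. The function name sniff_grammar and the sample public identifier are illustrative assumptions, not anything taken from Forrest or Cocoon.

```python
import xml.parsers.expat


class _StopParsing(Exception):
    """Raised from a handler to abandon parsing once the root is seen."""


def sniff_grammar(xml_bytes):
    """Return the DOCTYPE public id (if any) and the root element name.

    Parsing stops at the first start tag, so the rest of the document
    is never processed.
    """
    info = {"public_id": None, "root": None}
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["public_id"] = public_id

    def on_start(name, attrs):
        info["root"] = name
        raise _StopParsing  # nothing after the root tag is needed

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    try:
        parser.Parse(xml_bytes, True)
    except _StopParsing:
        pass
    return info


# Hypothetical sample document; the public id is only an example value.
sample = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE document PUBLIC "-//EXAMPLE//DTD Doc V1.1//EN" "doc.dtd">\n'
    b"<document><body/></document>"
)
print(sniff_grammar(sample))
```

The docwriter only saves mydoc.xml; whatever dispatches the pipeline can branch on the returned root name or public id.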

>> I'm vigorously opposing the idea of encoding meta-information twice 
>> and in different places: inside the document, using its filename, and 
>> in the request URI.
> Conceptually I agree, the hint is a "hack".


>> Consider this scenario:
>> URI:
>> http://somehost/documentnameA.html
>> http://somehost/documentnameB.pdf
>> source          step 1         |   step 2        step 3      step 4
>>                                |
>> A.docv11.xml      -            |   web.xsl      (skin.xsl)   serialize
>> B.docbook.xml   db2xdoc.xsl    |   paper.xsl                 serialize
>>                                |
>>                                ^
>>                             logical
>>                               view
>>                            format [1]
>> There's two concepts that could help us here:
>> 1) content-aware pipelines, as being articulated in some form in 
>> - the grammar 
>> of the XML source document as being passed across the pipeline will 
>> decide what extra preparatory transformation steps need to be done
> Ok.
>> 2) views - simple Cocoon views instead of the current skinning system, 
>> which would oblige us to seriously think of an intermediate 'logical' 
>> page format that can be fed into a media-specific stylesheet (web, 
>> paper, mobile, searchindexes, TOC's etc) resulting in media-specific 
>> markup that can be augmented with a purely visual skinning transformation
> Man, that's what I've been advocating all along.

I know - it's just that we add hack after hack to get Forrest out of the 
door ASAP, which brings us further away from the silver bullet solution 
(knowing very well that those don't exist, only in the mind of their 
creators - but we should really try).
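
To make the scenario table concrete, here is a minimal Python sketch of how a source grammar plus a requested output could pick the pipeline steps. The two dictionaries only restate the table from the quoted scenario; the names PREP_STEPS, VIEW_STEPS and plan_pipeline are invented for illustration and are not Cocoon APIs.

```python
# Step 1: preparatory transforms, chosen by the grammar of the source
# document (the content-aware part). doc-v11 needs none.
PREP_STEPS = {
    "docv11": [],
    "docbook": ["db2xdoc.xsl"],
}

# Steps 2 and 3: media-specific view, plus the optional visual skin.
# Both start from the same intermediate "logical view" format.
VIEW_STEPS = {
    "html": ["web.xsl", "skin.xsl"],
    "pdf": ["paper.xsl"],
}


def plan_pipeline(grammar, output):
    """Return the ordered list of steps for one source/output pair."""
    return PREP_STEPS[grammar] + VIEW_STEPS[output] + ["serialize"]


print(plan_pipeline("docv11", "html"))
print(plan_pipeline("docbook", "pdf"))
```

The point of the split is that neither table needs to know about the other: adding a new source grammar touches only PREP_STEPS, adding a new medium touches only VIEW_STEPS.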

> I think that the document.dtd can be such a step.
> The switch to using XHTML for it is *exactly* this.

I resonate with you on some intermediate format, but am struggling myself 
with what format we should use. Remember XHTML still carries a lot of 
structure-typographic elements like tables which can be misused in 
various ways. So what selection of XHTML elements/atts should we use for 
that intermediate format then? And how will we support the tricks Bert 
has been applying to the DTD documentation pipelines to have a much 
better rendition for the element content model description? I don't have 
the answer right now, but Marc and I are teasing each other to come up 
with a definitive solution somewhere in time. Time is a bit limited now, 
unfortunately: I'm also readying the launch of as I 
promised in

> Users that want to write a generic document use that dtd.
> All other content that must be "skinned" by forrest must be pregenerated 
> by other tools to give that dtd.


> We still have status.xml... etc files that get automatically transformed 
> to that format.

Yep - and I like that.

> I have been advocating the two step process since I started using Cocoon 
> (see also mails to the cocoon users for example), so I'm +10000 for it 
> being formalized :-D
>> Views are currently specified using the cocoon-view request parameter, 
>> so maybe we could use the request-parameter Selector for that purpose:
>>       <map:match pattern="**">
>>         <map:select type="request-parameter">
>>           <map:parameter name="parameter" value="cocoon-view"/>
>>           <map:when test="pdf">
>>             pdf pipeline acting on a 'logical page' view?
>>           </map:when>
>>           <map:when test="html"/>
>>         </map:select>
>>       </map:match>
>> Or we could write some Action which uses the URI to specify the 
>> chosen view/rendition.
> *This* -1.
> The hack of putting the intermediate step in the name is to make URI 
> space independent from the output space; you say that even that pollutes 
> the URI (I agree), and this is a step back.
> The best thing would be to understand something about the client 
> automatically, but also a request parameter can be ok.

I was thinking along the lines of choosing a Cocoon view based on the 
request environment *and* CAPs, but I need to have a serious whiteboard 
session on this - maybe it is time to host that Forrest hackathon over 
here RSN.

> The point is, can we use them in statically generated documentation?
> We cannot.  :-/

I fail to see why, but I might be stubborn ;-)

> So we simply should say that the output format is given by the filename, 
> but this is the output, not the input, and this brings us back to the 
> problem that writers should concentrate on the input, and use that for 
> the links to have view independence.
> See, browser technology constrains us :-/
>> I know all this is bringing us to a slowdown, but I couldn't care less: I 
>> feel we are deviating from best practices in favor of quick wins.
>> Caveat: I haven't spent enough time thinking and discussing this, and 
>> perhaps I have different interests (pet peeves) than others on the list.
> What you propose is the best route, but we need to be faster.

Why? ;-)

> Ok, let's go into it.
> 1) have two step process standard +1
> 2) switch documentdtd to be the intermediate format and become akin to 
> XHTML2 as in previous mails +1

I need to revise documentv11, as I promised already (too) many times. 
Can anyone drop the sky on me?

> 3) use content-aware pipelines - see below
> 4) link the sources, not the results.


> This is cool but what gets generated when I have
>  file.xml -> file.html
>  file.html -> file.html
> Both in the same dir?

Exactly what Marc just told me - I suggested going for a 
ResourceExistsAction there - would that help?
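
A rough sketch of what such an existence check could decide, in Python for brevity. This is not Cocoon's own action; the function name resolve and the rule that an existing file.html wins over one generated from file.xml are assumptions made for the example.

```python
from pathlib import Path


def resolve(doc_dir, name):
    """Decide how a request for <name>.html should be answered.

    Returns ("static", path) when <name>.html already exists on disk,
    ("generate", path) when only <name>.xml exists and must be run
    through the pipeline, or None when neither is present.
    """
    static = Path(doc_dir) / f"{name}.html"
    if static.is_file():
        return ("static", static)  # assumed precedence: hand-made HTML wins
    source = Path(doc_dir) / f"{name}.xml"
    if source.is_file():
        return ("generate", source)
    return None
```

With both file.xml and file.html in the same directory this always serves the static file, so the XML source is silently shadowed; that is the ambiguity raised in the quoted question.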

> If I link to file.xml, I get the link translated to file.html, but then 
> what file do I get to see?
> This is the reason why we need a 1-1 relationship.
> Now to explain the why of the double ext (again):
> We have file.xml
> - user must link using the same filename
>  link href="file.xml"
> - the browser needs the filename with the resulting extension:
>  file.html

Filename generation is part of the crawler, and we can patch/configure 
that, if we want.
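
One way such naming could be configured, sketched in Python: keep the source extension in the generated name only when two sources would otherwise end up with the same output name. The function output_names is hypothetical and says nothing about how the actual crawler works; it also assumes flat file names within a single directory.

```python
from collections import Counter


def output_names(sources, ext="html"):
    """Map each source file name to the name the crawler would write."""
    suffix = "." + ext

    def short(name):
        # Files already in the output format are passed through as-is.
        if name.endswith(suffix):
            return name
        return name.rsplit(".", 1)[0] + suffix

    counts = Counter(short(s) for s in sources)
    names = {}
    for s in sources:
        if s.endswith(suffix) or counts[short(s)] == 1:
            names[s] = short(s)
        else:
            names[s] = s + suffix  # clash: fall back to the double extension
    return names


print(output_names(["file.xml", "file.html", "index.xml"]))
```

Writers keep linking to the source name (file.xml); only the generated name changes, and only for the clashing pair.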

> - the system needs to have unique names
> So this brings us *necessarily* to having both xml and html included in 
> the extension.
> xml for uniqueness, html for the browser.
> Or maybe have it just become with double extension only with clashing 
> names, but then, how can the user tell to generate a pdf out of it if 
> there is only .xml extension?
> You say it shouldn't know, because part of the view?
> Go tell the users.
> And how can they do it without breaking the uri?
> Ha.

Duh ;-)

Will think about that some more, at least we have the discussion going 
again :-)


Steven Noels                  
Outerthought - Open Source, Java & XML Competence Support Center            
