forrest-dev mailing list archives

From Nicola Ken Barozzi <nicola...@apache.org>
Subject Re: [VOTE] Usage of file.hint.ext convention
Date Mon, 02 Sep 2002 11:51:13 GMT

Steven Noels wrote:
> Nicola Ken Barozzi wrote:
> 
> <snip/>
> 
>>> But it requires the docwriter to think about the management concern, 
>>> IMO.
>>
>> They still write mydoc.xml or mydoc.gif, no?
>> This isn't a management concern, no?
> 
> 
> No, that's the way silly OS'es bind editing apps to filetypes. 

Simple Minds: "Quit dreaming, this is real life, baby..." ;-)

> On Unix 
> and Mac OS, this has been solved in a more robust way, IMHO 
> (/etc/mime-magic and Resource Forks).

Resource forks?
It seems cool, what is it?

>>> I'm still thinking about those content-aware pipelines, and for some 
>>> app we are developing, we actually have been using this technique 
>>> doing a XML Pull Parse on the document to check its root element - 
>>> here, we could check for its DTD identifier.
>>
>>
>>
>> It's neat, but a PITA for many users.
> 
> 
> How come? Using CAPs (content-aware pipelines), the system decides what 
> will be done with their XML files, depending on the editing grammar they 
> used.

Because users are used ;-) to declaring the mimetype in the filename.
And doctypes are difficult to write.

Ok, it *is* a PITA for many users, but not a blocking one.
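
To make the content-aware idea quoted above concrete, here is a minimal sketch of checking a document's root element and DOCTYPE public identifier without reading the whole file. It is an illustration only, written in Python with the standard expat bindings rather than the XML pull parser Steven mentions. The names `sniff_grammar`, `preparatory_step` and the `PREP_STYLESHEETS` table are hypothetical, as are the root element names used as keys; only `db2xdoc.xsl` comes from the scenario quoted below.

```python
import xml.parsers.expat


class _RootFound(Exception):
    """Raised from the handler to stop parsing once the root element is seen."""


def sniff_grammar(xml_text):
    """Return (root_element_name, doctype_public_id) for an XML document."""
    info = {"root": None, "public_id": None}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["public_id"] = public_id

    def on_start(name, attrs):
        info["root"] = name
        raise _RootFound  # nothing after the root start tag is needed

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    try:
        parser.Parse(xml_text, True)
    except _RootFound:
        pass
    return info["root"], info["public_id"]


# Hypothetical mapping from root element to the preparatory stylesheet
# (None means the document is already in the intermediate format).
PREP_STYLESHEETS = {"document": None, "article": "db2xdoc.xsl"}


def preparatory_step(xml_text):
    root, _public_id = sniff_grammar(xml_text)
    return PREP_STYLESHEETS.get(root)
```

With something like this the writer never names the grammar in the filename, but the document does have to carry a recognizable root element or DOCTYPE, which is the part I called a PITA.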

>>> I'm vigorously opposing the idea of encoding meta-information twice 
>>> and in different places: inside the document, using its filename, and 
>>> in the request URI.
>>
>>
>>
>> Conceptually I agree, the hint is a "hack".
> 
> 
> Yes.
> 
>>> Consider this scenario:
>>>
>>> URI:
>>>
>>> http://somehost/documentnameA.html
>>> http://somehost/documentnameB.pdf
>>>
>>>
>>> source          step 1         |   step 2        step 3      step 4
>>>                                |
>>> A.docv11.xml      -            |   web.xsl      (skin.xsl)   serialize
>>> B.docbook.xml   db2xdoc.xsl    |   paper.xsl                 serialize
>>>                                |
>>>                                ^
>>>                             logical
>>>                               view
>>>                            format [1]
>>>
>>>
>>> There's two concepts that could help us here:
>>>
>>> 1) content-aware pipelines, as being articulated in some form in 
>>> http://marc.theaimsgroup.com/?t=102767485200006&r=1&w=2 - the grammar
>>> of the XML source document as being passed across the pipeline will 
>>> decide what extra preparatory transformation steps need to be done
>>
>>
>>
>> Ok.
>>
>>> 2) views - simple Cocoon views instead of the current skinning 
>>> system, which would oblige us to seriously think of an intermediate 
>>> 'logical' page format that can be fed into a media-specific 
>>> stylesheet (web, paper, mobile, searchindexes, TOC's etc) resulting 
>>> in media-specific markup that can be augmented with a purely visual 
>>> skinning transformation
>>
>>
>>
>> Man, that's what I've been advocating all along.
> 
> I know - it's just that we add hack after hack to get Forrest out of the 
> door ASAP, which brings us further away from the silver bullet solution 
> (knowing very well that those don't exist, only in the mind of their 
> creators - but we should really try).

:-)

>> I think that the document.dtd can be such a step.
>> The switch to using XHTML for it is *exactly* this.
> 
> 
> I resonate with you on some intermediate format, but am struggling myself 
> with what format we should use. Remember XHTML still carries a lot of 
> structure-typographic elements like tables which can be misused in 
> various ways. 

Table is a semantic thing.
People can always abuse things; heck, you can abuse Cocoon in many 
ways, but as Stefano said, it's unavoidable.
The protection you give users must be reasonable, not more.

> So what selection of XHTML elements/atts should we use for 
> that intermediate format then?

See the archives; I did a comparison between documentDTD and XHTML wd2.

> And how will we support the tricks Bert 
> has been applying to the DTD documentation pipelines to have a much 
> better rendition for the element content model description? I don't have 
> the answer right now, but Marc and I are teasing each other to come up 
> with a definitive solution somewhere in time. 

Use div and span tags.

> Time is a bit limited now, 
> unfortunately: I'm also readying the launch of cocoondev.org as I 
> promised in 
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=102881527330596&w=2
> 
>> Users that want to write a generic document use that dtd.
>> All other content that must be "skinned" by forrest must be 
>> pregenerated by other tools to give that dtd.
> 
> 
> +1
> 
>> We still have status.xml... etc files that get automatically 
>> transformed to that format.
> 
> 
> Yep - and I like that.
> 
>> I have been advocating the two step process since I started using 
>> Cocoon (see also mails to the cocoon users for example), so I'm +10000 
>> for it being formalized :-D
>>
>>> Views are currently specified using the cocoon-view request 
>>> parameter, so maybe we could use the request-parameter Selector for 
>>> that purpose:
>>>
>>>       <map:match pattern="**">
>>>         <map:select type="request-parameter">
>>>           <map:parameter name="parameter" value="cocoon-view"/>
>>>           <map:when test="pdf">
>>>             pdf pipeline acting on a 'logical page' view?
>>>           </map:when>
>>>           <map:when test="html"/>
>>>         </map:select>
>>>       </map:match>
>>>
>>> Or we could write some Action which uses the URI to specify the 
>>> chosen view/rendition.
>>
>>
>>
>> *This* -1.
>>
>> The hack of putting the intermediate step in the name is to make URI 
>> space independent from the output space; you say that even that 
>> pollutes the URI (I agree), and this is a step back.
>>
>> The best thing would be to understand something about the client 
>> automatically, but also a request parameter can be ok.
> 
> 
> I was thinking along the lines of choosing a Cocoon view based on the 
> request environment *and* CAPs, but I need to have a serious whiteboard 
> session on this - maybe it is time to host that Forrest hackathon over 
> here RSN.

:-D

>> The point is, can we use them in statically generated documentation?
>>
>> We cannot.  :-/
> 
> 
> I fail to see why, but I might be stubborn ;-)

When I persist the file on disk, I cannot write a file named 
"file.xml?view=mine".
If I encode it in the name, we're back to step one.
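
A small sketch of why the request parameter gets lost in a static build, assuming a plain file server that maps only the path part of the URL to a file on disk. The function name `static_file_for` is hypothetical; the host and file names are just the ones used in this thread.

```python
from urllib.parse import urlsplit


def static_file_for(url):
    """Relative path a plain static file server would look up for this URL.

    Only the path component is used; the query string plays no part.
    """
    return urlsplit(url).path.lstrip("/")


# Two different views collapse onto the same file on disk.
html_view = static_file_for("http://somehost/file.xml?view=mine")
other_view = static_file_for("http://somehost/file.xml?view=other")
same_file = html_view == other_view
```

Both URLs resolve to the same file, so a view selected by request parameter cannot be told apart once the site is written to disk.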

>> So we simply should say that the output format is given by the 
>> filename, but this is the output, not the input, and this brings us 
>> back to the problem that writers should concentrate on the input, and 
>> use that for the links to have view independence.
>>
>> See, browser technology constrains us :-/
>>
>>> I know all this is bringing us to a slowdown, but I couldn't care less: 
>>> I feel we are deviating from best practices in favor of quick wins.
>>>
>>> Caveat: I haven't spent enough time thinking and discussing this, and 
>>> perhaps I have different interests (pet peeves) than others on the list.
>>
>>
>>
>> What you propose is the best route, but we need to be faster.
> 
> 
> Why? ;-)

;-)

>> Ok, let's go into it.
>>
>> 1) have two step process standard +1
>> 2) switch documentdtd to be the intermediate format and become akin to 
>> XHTML2 as in previous mails +1
> 
> 
> I need to revise documentv11, as I promised already (too) many times. 
> Can anyone drop the sky on me?

I did it in the XHTML comparison. Please take a look.

>> 3) use content-aware pipelines - see below
>> 4) link the sources, not the results.
> 
> 
> +1
> 
>> This is cool but what gets generated when I have
>>  file.xml -> file.html
>>  file.html -> file.html
>> Both in the same dir?
> 
> 
> Exactly what Marc just told me - I suggested going for a 
> ResourceExistAction there - would that help?

We already discussed this; no, it doesn't.
The user has no real idea that one was chosen over the other.
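
The clash can be shown with a minimal sketch, assuming the crawler derives the output name by simply replacing the source extension with the result extension. `find_clashes` is a hypothetical helper, not part of Forrest or Cocoon.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_clashes(sources, result_ext="html"):
    """Group source files that end up with the same output name.

    Assumes the output name is the source name with its extension
    replaced by result_ext.
    """
    by_output = defaultdict(list)
    for src in sources:
        out = str(PurePosixPath(src).with_suffix("." + result_ext))
        by_output[out].append(src)
    # Keep only the outputs claimed by more than one source.
    return {out: srcs for out, srcs in by_output.items() if len(srcs) > 1}


clashes = find_clashes(["file.xml", "file.html", "other.xml"])
```

Here `clashes` reports that `file.html` is claimed by both `file.xml` and `file.html`; whichever one the system picks, the writer who linked to the other gets no warning.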

>> If I link to file.xml, I get the link translated to file.html, but 
>> then what file do I get to see?
>>
>> This is the reason why we need a 1-1 relationship.
>>
>> Now to explain the why of the double ext (again):
>>
>> We have file.xml
>>
>> - user must link using the same filename
>>
>>  link href="file.xml"
>>
>> - the browser needs the filename with the resulting extension:
>>
>>  file.html
> 
> 
> Filename generation is part of the crawler, and we can patch/configure 
> that, if we want.

Yes but... |
            v

>> - the system needs to have unique names
>>
>> So this brings us *necessarily* to having both xml and html included 
>> in the extension.
>> xml for uniqueness, html for the browser.
>>
>> Or maybe use the double extension only for clashing names, but then, 
>> how can the user tell it to generate a pdf if there is only the .xml 
>> extension?
>>
>> You say they shouldn't know, because it's part of the view?
>> Go tell the users.
>> And how can they do it without breaking the uri?
>>
>> Ha.
> 
> 
> Duh ;-)
> 
> Will think about that some more, at least we have the discussion going 
> again :-)

:-)

Exactly why I brought the vote up.

The double ext solution is a quick solution to this problem, which Marc 
and I regard as a minor hack.

If anyone has better suggestions, please bring them on, because we need 
to go forward.

                          <><><>

In essence, the problem is that the URL should expose both the content 
and the result hints.

Why?

- The content is needed for the writer and 1-1 mapping.
- The result is for the browser *and* to make more results available 
with the same resource (page.html and page.pdf).

Where can we put this in the URL?

In browsers, the result is in the extension (see also the .pdf extension 
problem and accompanied hacks and the fact that IE uses the extension 
instead of mime-type).

What remains is the content.

We can encode it in the file or in the path.
But since we want a clean URI that is semantically rich, we put it in 
the filename.

Hence the extension proposal.
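
A minimal sketch of the naming rule, under one reading of the proposal: the output keeps the full source filename (content hint) and appends the result extension. The helper names `output_name` and `source_name` are hypothetical.

```python
def output_name(source, result_ext):
    """Output filename: the source name (content hint) plus the result extension."""
    return f"{source}.{result_ext}"


def source_name(output):
    """Recover the source filename by dropping the last (result) extension."""
    source, _dot, _result_ext = output.rpartition(".")
    return source


sources = ["file.xml", "file.html"]
html_outputs = [output_name(s, "html") for s in sources]
# Each source gets its own output name, so the mapping stays 1-1.
is_one_to_one = len(set(html_outputs)) == len(sources)
```

Under this reading `file.xml` becomes `file.xml.html` and `file.html` becomes `file.html.html`, and the same source can also be asked for as `file.xml.pdf`.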

           <><><>

Second proposal:

Nobody brought up the fact that the filename has semantics associated with it.

So:

  mypage.xml
  ->  /path/to/mypage/xml.html
  ->  /path/to/mypage/xml.pdf

  mypage.pdf
  ->  /path/to/mypage/html.pdf

Standard automatic view from live cocoon:

  mypage.xml
  ->  /path/to/mypage
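
A rough sketch of this second mapping, covering only the mypage.xml case shown above and assuming the page name becomes a directory and the source extension becomes the file name inside it. `view_path` and `default_view_path` are hypothetical names.

```python
from pathlib import PurePosixPath


def view_path(source, result_ext):
    """Map /path/to/mypage.xml to /path/to/mypage/xml.<result_ext>."""
    p = PurePosixPath(source)
    content_hint = p.suffix.lstrip(".")  # "xml" for mypage.xml
    return str(p.with_suffix("") / f"{content_hint}.{result_ext}")


def default_view_path(source):
    """Standard automatic view from a live Cocoon: no extension at all."""
    return str(PurePosixPath(source).with_suffix(""))
```

For example, `view_path("/path/to/mypage.xml", "pdf")` gives `/path/to/mypage/xml.pdf`, while `default_view_path("/path/to/mypage.xml")` gives `/path/to/mypage`.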

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------

