forrest-dev mailing list archives

From "Marc Portier" <>
Subject RE: Impossible to integrate PDF documents in forrest site?
Date Mon, 26 Aug 2002 11:16:29 GMT

> -----Original Message-----
> From: Nicola Ken Barozzi []
> Jeff Turner wrote:
> > On Thu, Aug 22, 2002 at 05:30:48PM +0200, Marc Portier wrote:
> >
> >>>-----Original Message-----
> >>>From: Nicola Ken Barozzi []
> >>>
> >>>>  myfile.pipeline.extension
> >>>>
> >>>>I like it.
> >>>
> >>>Expanding a bit on this:
> >>>
> >>>We can dictate that every file in the contents can
> >>>have a double extension.
> >>
> >>there was even mention of more than two at the moment the idea
> >>got me going ....
> >><snip
> >>from="
> 029759626196
> >>43&w=2">
> >>Next to that I don't see why there would be no room for something
> >>like
> >>testresult.metric.svg.jpg?
> >>
> >>Looking at it from this angle the multiple parts of the new
> >>extension become like a route or a trail describing how to
> >>get from *.metric.xml to jpg via svg. (Supposing there could
> >>be more than one route)
> >
> >
> > Back in Cocoon 1 days, each XML file had processing instructions at the
> > top, telling Cocoon what series of transformations to apply to it, and
> > what its final content type should be:
> >
> > <?cocoon-process type="xsp"?>
> > <?cocoon-format type="text/html"?>
> >
> > With Cocoon 2, the whole idea of the sitemap was to get rid of this, and
> > have a single point of control.
> Not really.
> The idea is getting rid of the *reaction* that this provides, ie the
> fact that a PI is used to generate something that has a PI that is used
> to generate something etc...

indeed. this is different

> > Now the idea of having 'myfile.pipeline.extension' seems identical to
> > PIs, only not as clean. It takes control away from the sitemap.
> > Subversion of control.
> No, the sitemap is still the only way of defining the pipelines.

and the file extensions still do what you want them to do: give, at
fs level, a visual clue to what the file is about. Having more dots in
there allows for grasping more semantic nuances of that.
> Are we already taking some control away from the sitemap when we use
> extensions?
> No, since the goal of the sitemap is to assemble pipelines, and to
> select which pipeline to apply given some rules.
> myfile.pipeline.extension is just another rule.


> If one day I decide to change what the pipeline does, I can do it from
> the sitemap without touching the files, hence the sitemap is definitely
> in control.
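
To make the "just another rule" point concrete, here is a minimal sketch in Python rather than in real sitemap syntax. The rule table, the pattern strings and the step names are all hypothetical and only stand in for what a sitemap would declare. The files carry a naming hint, while the mapping from hint to pipeline lives in one place:

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the sitemap: ordered (pattern, pipeline) rules.
# The step names are invented for illustration only.
SITEMAP_RULES = [
    ("*.metric.xml", ["metric-to-svg", "svg-to-jpg"]),
    ("*.xml", ["document-to-html"]),
]

def select_pipeline(filename):
    """Return the pipeline of the first rule whose pattern matches, else None."""
    for pattern, pipeline in SITEMAP_RULES:
        if fnmatchcase(filename, pattern):
            return pipeline
    return None

print(select_pipeline("testresult.metric.xml"))  # ['metric-to-svg', 'svg-to-jpg']
print(select_pipeline("index.xml"))              # ['document-to-html']
```

Changing what the "metric" pipeline does means editing the table only; no content file is touched or renamed.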

> > Or have I missed something? What's wrong with a simple action that checks
> > if a PDF exists?
> It does only one check.
> Of course, we can continue to check with some rules till we find the
> first available file, but it's not really different given that this file
> has to be recognizable, maybe with an extension.

my -1 on this would be based on predictability
it is a bit like scalars in perl versus strong type-checking in
I'd like to be sure of what comes out, even if that requires more
attention and discipline from the user.
He should be helped through meaningful messages and logs though.
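
A rough sketch of what I mean by predictable, in Python and not tied to any Cocoon or Forrest API: the table of known conversions, the error class and the function name are assumptions made up for this example. An unknown step is rejected with a message that names it, instead of silently falling back to whatever file happens to exist:

```python
import logging

log = logging.getLogger("trail")

# Hypothetical set of conversions the site knows how to perform.
KNOWN_HOPS = {("metric", "svg"), ("svg", "jpg"), ("svg", "png")}

class UnknownHopError(ValueError):
    """Raised when a trail asks for a conversion that is not declared."""

def check_trail(parts):
    """Validate every consecutive pair in the trail; fail loudly on the first gap."""
    for src, dst in zip(parts, parts[1:]):
        if (src, dst) not in KNOWN_HOPS:
            msg = f"no conversion from {src!r} to {dst!r} in trail {'.'.join(parts)!r}"
            log.error(msg)
            raise UnknownHopError(msg)
    log.info("trail %s is valid", ".".join(parts))
    return True
```

With this, `check_trail(["metric", "svg", "jpg"])` passes, while `check_trail(["metric", "pdf"])` raises an error that says which hop is missing.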

> The most right thing to do would seem to be to check the contents of the
> file to understand the MIME/TYPE, but also this is too little, because
> it maps MIME/TYPE<->pipeline.
> The fact is that the author *must* tell the pipeline about its stuff, as
> it uses tags in the document instead of plain text.
> The important thing is that it does it with generic semantics.
> >></snip>
> >>
> >>
> >>>The output of Forrest will never show the internal
> >>>extension, it's only
> >>>a hint to the pipelines about the content of the file.
> But it can still remain as a semantic piece of info, now that I think
> of it.

By the way, the multi-dot file is not edited by the author: he is
delivering the bare xml format.
It would only be the URL, and thus the saved-by-crawler file, that
will show some trail of where it originated from.

So you could see it more as allowing the URL-hacker to specify
the output-format.
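
As an illustration of reading such a URL, here is a small Python sketch. The function name and the choice to treat each dot-separated part as one step are my assumptions, not anything Forrest does today. It splits a requested name like testresult.metric.svg.jpg into the base name and the hops of the trail:

```python
def parse_trail(requested_name):
    """Split 'base.a.b.c' into the base name, the parts, and the (from, to) hops."""
    base, *parts = requested_name.split(".")
    if not base or not parts or any(not p for p in parts):
        raise ValueError(f"not a usable name: {requested_name!r}")
    # Each consecutive pair of parts is one conversion step on the route.
    hops = list(zip(parts, parts[1:]))
    return base, parts, hops

base, parts, hops = parse_trail("testresult.metric.svg.jpg")
print(base)  # testresult
print(hops)  # [('metric', 'svg'), ('svg', 'jpg')]
```

Under this reading the author only delivers testresult.metric.xml; the rest of the trail exists in the URL alone.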

