forrest-user mailing list archives

From Maurice Lanselle <lanselle.als...@evc.net>
Subject Re: Sitemap dot xmap sourcetyping
Date Tue, 07 Jun 2005 07:33:11 GMT
David Crossley said the following on 07/06/2005 03:46:

>Maurice Lanselle wrote:
>  
>
>>Ross Gardler said the following:
>>    
>>
>>>However, you should avoid limiting the URL space of your application 
>>>by requiring a given file type to have a given filename or path. This 
>>>can result in false matches. You should use the SourceTypeResolver; see
>>>http://svn.apache.org/viewcvs.cgi/forrest/trunk/plugins/org.apache.forrest.plugin.input.simplified-docbook/input.xmap?view=markup
>>>for an example of how to do this.
>>>      
>>>
>>I totally buy your point about not requiring a document type to have a 
>>given file name or path.  Since I'm trying to use or produce xml files 
>>which have the ".xml" extension rather than something more distinctive, 
>>I would like Forrest to choose the stylesheet on the basis of the 
>>doctype, perhaps using a catalog (like for resolving DTDs...why not 
>>catalogs for xsl?).  ...
>>    
>>
>
>No, the "catalog entity resolver" addresses a separate part of the issue.
>It sounds like we need to enhance the documentation. Source Type Resolver,
>actually called "SourceTypeAction (content aware pipelines)" [2] is one of
>the key features of Forrest and so we need to explain it better.
>
>Let's first correct your comment about "catalog". Its use is to create
>an efficient system for xml documents that declare a DTD so that the
>xml parser gets a local copy rather than going across the network.
>
>  
>
Sorry if I wasn't clear.  When I referred to using *a* catalog (not 
*the* catalogs used for validation resolving) I meant just the concept 
of a look-up table the pipeline could use to identify a 
document-type-specific handling to apply.  That seems to be what is 
being done in the sitemap.  If I understand your SourceTypeAction doc 
and what is being defined in the input.xmap (url above),
a) one can define classification rules for xml documents based on some 
types of document header information,
b) apply these rules to determine a "sourcetype" (=classification)
c) use the "sourcetype" (classification) to select the processing to apply.

a) is done in the map:actions section, as you explain in the doc.
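
As I understand it, such a declaration might look roughly like the sketch below. This is only an illustration: the class name is my assumption, inferred from the source path main/java/org/apache/forrest/sourcetype, and the public-id is the standard DocBook XML V4.1.2 identifier rather than something copied from input.xmap.

```xml
<map:actions>
  <!-- Assumed class name, inferred from the Forrest source path -->
  <map:action name="sourcetype"
              src="org.apache.forrest.sourcetype.SourceTypeAction">
    <!-- One classification rule: documents whose DOCTYPE declares this
         public identifier are classified as "docbook-v4.1.2" -->
    <sourcetype name="docbook-v4.1.2">
      <document-declaration public-id="-//OASIS//DTD DocBook XML V4.1.2//EN" />
    </sourcetype>
  </map:action>
</map:actions>
```

The name given on the sourcetype element is the value that later shows up as {sourcetype} in step c).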

b) is done by "map:act" (=a function call) when the document is 
encountered and a processing decision is to be taken:

<map:act type="sourcetype" src="{1}">
or 
<map:act type="sourcetype" src="{src}">

c) is done by a "select:parameter-selector-test:when" construct 
resembling a select-case in a "resource" (=subroutine) named 
"transform-to-document":

<map:resource name="transform-to-document">
      <map:act type="sourcetype" src="{src}">
        <map:select type="parameter">
          <map:parameter name="parameter-selector-test" value="{sourcetype}" />

          <map:when test="docbook-v4.1.2">
            <map:generate src="{project:content.xdocs}{../../1}.xml" />
            <map:transform src="{forrest:plugins}/org.apache.forrest.plugin.input.simplified-docbook/resources/stylesheets/sdocbook2document.xsl" />
            <map:serialize type="xml-document"/>
          </map:when>
...
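
If I read it correctly, the resource is then invoked from an ordinary pipeline match, which passes the document location in as the "src" parameter. A minimal sketch, assuming a match pattern of "**.xml" (the pattern and the parameter value are my guesses, not copied from input.xmap):

```xml
<map:pipeline>
  <!-- Match on the URL only; the document type is decided later by
       the sourcetype action inside the resource, not by the file name -->
  <map:match pattern="**.xml">
    <map:call resource="transform-to-document">
      <!-- Becomes {src} inside the resource -->
      <map:parameter name="src" value="{project:content.xdocs}{1}.xml" />
    </map:call>
  </map:match>
</map:pipeline>
```

That would also explain my earlier error: "sourcetype" is an action type, so it belongs on map:act inside a match or resource, not on map:pipeline.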

>Now back to Source Type Action ... It is a Cocoon sitemap component that
>peeks at the top-part of a document to look for hints about the type
>of the document. 
>
>[1] http://forrest.apache.org/docs/your-project.html#sitemap.xmap 
>[2] http://forrest.apache.org/docs/cap.html
>
>These are the available methods:
>document-declaration
>document-element and namespace
>processing-instruction
>w3c-xml-schema
>
>  
>
While reading the SourceTypeAction doc, a couple of questions came to 
mind.  I think it would be helpful to find their answers in that doc:

1) What is the appropriate way to construct "OR" classification rules?  
For instance, the document-element may return a local-name, a namespace, 
or both. Should one define two (or more) rules with the same sourcetype 
name, such as...

<sourcetype name="foo">
    <document-element local-name="foo"/>
</sourcetype>
<sourcetype name="foo">
    <document-element namespace="bar"/>
</sourcetype>

or a single rule with a list of alternative conditions, such as...

<sourcetype name="foo">
    <document-element local-name="foo"/>
    <document-element namespace="bar"/>
</sourcetype>

or is there some other syntax?

2) How does one construct "AND" classification rules?

<sourcetype name="foo">
    <document-element local-name="foo"/> && <document-element namespace="bar"/>
</sourcetype>


These are not urgent (for me), but I expect they will be wanted sooner 
or later.

Regards and thanks for the communication,
Maurice

>If you use the first technique, then the parser needs to go retrieve
>the DTD from across the network. Hence the need for Catalog Entity Resolver.
>
>I don't use "w3c-xml-schema" so I am not sure if the parser is forced
>to locate the actual schema. I gather that it doesn't. Therefore you
>don't need to mess about with catalogs.
>
>Now if there are Java people out there listening, then perhaps you would
>like to enhance the Source Type Action to enable other methods. It is in
>the Forrest source at main/java/org/apache/forrest/sourcetype
>
>--David
>
>  
>
>>...  It looks like that is what happens in the 
>><map:resources> group in the example you pointed me to (below).  My 
>>first attempt to *bend* it to my purpose failed, however: "Type 
>>'sourcetype' does not exist for 'map:pipeline' at..." when I replaced
>>
>> <map:pipeline>
>>  <map:match pattern="**Resume.xml">
>>by
>> <map:pipeline type="sourcetype" src="{src}">
>>       <map:select type="parameter">
>>         <map:parameter name="parameter-selector-test" 
>>value="{sourcetype}" />
>>         <map:when test="Resume">
>>
>>But one thing at a time...xslt first, then plugin/resolving.
>>
>>Many thanks,
>>Maurice
>>    
>>
>
>  
>

