From Berin Loritsch
Subject [RT] Negative Matching Constructs && Mime Type Matching
Date Tue, 27 Nov 2001 16:11:03 GMT
While using Cocoon to generate the developer's docs, I have discovered some use
cases where a more flexible matcher should be considered.  First, let me outline
how it works (now that we can follow links....):

1) By default, all requests for **.html files are handled by a default pipeline
2) We override this default for the developer's docs (have to perform includes for
     one big document)
3) We override this default for externally generated HTML files (i.e. the UML
4) We override this again for the API docs and return index.xml because it will be
     overridden later.

Why are the latter two even necessary?  They didn't use to be, but the URI following
in Cocoon does not allow me to exclude certain URI paths from Cocoon.  In a way, I
would like to have a method for negative matching, and a way to perform aggregated

Negative matching basically works so that Cocoon will IGNORE the matched URIs.
For example, if I have a directive that will directly generate a 404 error, I can
satisfy this requirement like this:

<map:match pattern="api/**">
  <map:generate-error type="404"/>
</map:match>
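
As a rough illustration of the "ignore" idea outside the sitemap, the sketch below filters links before a crawler follows them.  It is Python, not Cocoon code: IGNORED_PATTERNS, is_ignored and links_to_follow are hypothetical names, and the standard fnmatch module only approximates Cocoon's wildcard rules (its "*" also matches "/").

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion list, consulted before any pipeline is tried.
IGNORED_PATTERNS = ["api/**"]

def is_ignored(uri):
    """Return True when the URI matches one of the negative patterns."""
    return any(fnmatchcase(uri, pattern) for pattern in IGNORED_PATTERNS)

def links_to_follow(links):
    """Keep only the links that no negative pattern excludes."""
    return [uri for uri in links if not is_ignored(uri)]
```

The point of the sketch is only the ordering: exclusions are checked first, so an excluded URI never reaches a pipeline at all.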

The Aggregate matching would change the way patterns are expressed.  I know we have
a RegExp matcher--which is great, but sometimes we want something more familiar:

<map:match pattern="{api|diagrams}/**">
  <map:generate-error type="404"/>
</map:match>
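
One way such an aggregate pattern might be supported without touching the existing wildcard matcher is to expand it into plain patterns up front.  The following Python sketch assumes that approach; expand_alternatives is a hypothetical helper, and it requires a "|" inside the braces so that substitution tokens such as {1} are left alone.

```python
import re

# A brace group counts as an alternation only if it contains at least one "|".
_ALTERNATION = re.compile(r"\{([^{}]*\|[^{}]*)\}")

def expand_alternatives(pattern):
    """Expand every {a|b|c} group into a list of plain wildcard patterns."""
    found = _ALTERNATION.search(pattern)
    if found is None:
        return [pattern]
    head, tail = pattern[:found.start()], pattern[found.end():]
    expanded = []
    for option in found.group(1).split("|"):
        # Recurse so that later groups in the tail are expanded as well.
        expanded.extend(expand_alternatives(head + option + tail))
    return expanded

print(expand_alternatives("{api|diagrams}/**"))
```

Expansion keeps the matcher simple, but the chosen alternative is no longer available as a numbered token; a matcher that needs it would have to treat the group as a capture instead.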

This also opens the door for something equally powerful on the positive matching side:

<map:match pattern="**/images/*.{gif|jpg|png}">
  <map:select type="parameter">
    <map:parameter name="parameter" value="{3}"/>
    <map:when test="jpg">
      <map:read src="images/{2}.{3}" mime-type="image/jpeg"/>
    </map:when>
    <map:otherwise>
      <map:read src="images/{2}.{3}" mime-type="image/{3}"/>
    </map:otherwise>
  </map:select>
</map:match>
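
To make the {2} and {3} tokens concrete, here is a small Python sketch that treats each wildcard and each alternation group as a numbered capture, in order of appearance.  That numbering is an assumption read off the example above, and compile_pattern and substitute are hypothetical names, not part of Cocoon.

```python
import re

def compile_pattern(pattern):
    """Translate a wildcard pattern with {a|b} groups into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append("(.*)")        # '**' may span path separators
            i += 2
        elif pattern[i] == "*":
            parts.append("([^/]*)")     # '*' stays within one path segment
            i += 1
        elif pattern[i] == "{":
            end = pattern.index("}", i)  # raises ValueError if the group is unclosed
            options = pattern[i + 1:end].split("|")
            parts.append("(" + "|".join(re.escape(o) for o in options) + ")")
            i = end + 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")

def substitute(template, match):
    """Replace {n} tokens in the template with the n-th captured value."""
    return re.sub(r"\{(\d+)\}", lambda tok: match.group(int(tok.group(1))), template)

matcher = compile_pattern("**/images/*.{gif|jpg|png}")
hit = matcher.match("docs/images/logo.png")
print(substitute("images/{2}.{3}", hit))   # images/logo.png
print(substitute("image/{3}", hit))        # image/png
```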

Of course, then I would want to go a step further and explicitly state my mime-type
first in the select statement and only have one read.

<map:mime-type value="image/jpeg"/>

Although, we can apply separation of concerns again.  In other words, mime-type matching
is not always a concern of the sitemap.  URIs with standard extensions should not need
to have the mime-type matched by the sitemap.  IOW, standard extensions such as ".pdf",
".jpg", ".gif", ".png", ".rtf", etc. should have a table that automatically gets looked
up and applied to the response.  This can be a mimetypes file that can either be a
simple properties file (there is only a flat hierarchy to mime-type resolution) or a
simplified configuration file:

<mime:entry extension="pdf" type="application/pdf"/>

Of course, it can be grouped for maintenance purposes like this:

<mime:table group="application">
  <mime:entry extension="pdf" type="pdf"/>
</mime:table>
<mime:table group="image">
  <mime:entry extension="jpg" type="jpeg"/>
  <mime:entry extension="gif" type="gif"/>
</mime:table>

The table approach can be further shortened by removing the "type" attribute if it is the
same as the extension.  Keep in mind that command line environments and other yet-to-be
created environments won't have the same mechanisms for mime-type resolution, and it should
be easy to create a general component that performs the resolution for you.  This way 90%
of all mime-type declarations can be resolved in a maintainable and uniform manner.
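
A general resolution component along these lines could be quite small.  The Python sketch below is one possible shape, not an existing Cocoon API: MIME_TABLE is a hypothetical in-memory form of the grouped table, with None standing for an omitted "type" attribute, and build_lookup and resolve_mime_type are illustrative names.

```python
import posixpath

# Hypothetical grouped table: group -> {extension: subtype, or None if same as extension}
MIME_TABLE = {
    "application": {"pdf": None, "rtf": None},
    "image": {"jpg": "jpeg", "gif": None, "png": None},
}

def build_lookup(table):
    """Flatten the grouped table into an extension -> mime-type dictionary."""
    lookup = {}
    for group, entries in table.items():
        for extension, subtype in entries.items():
            # An omitted subtype defaults to the extension itself.
            lookup[extension] = f"{group}/{subtype or extension}"
    return lookup

def resolve_mime_type(uri, lookup, default=None):
    """Look up the mime-type from the URI's extension, or return the default."""
    extension = posixpath.splitext(uri)[1].lstrip(".").lower()
    return lookup.get(extension, default)

LOOKUP = build_lookup(MIME_TABLE)
print(resolve_mime_type("docs/manual.pdf", LOOKUP))   # application/pdf
print(resolve_mime_type("images/logo.jpg", LOOKUP))   # image/jpeg
```

A URI without an extension falls through to the default, which is where an explicit declaration would still be needed.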

The only need for the mime-type would be when your URI does not have an extension, or it is



