Return-Path:
Delivered-To: apmail-xml-forrest-dev-archive@xml.apache.org
Received: (qmail 87256 invoked by uid 500); 8 Sep 2002 21:48:39 -0000
Mailing-List: contact forrest-dev-help@xml.apache.org; run by ezmlm
Precedence: bulk
list-help:
list-unsubscribe:
list-post:
Reply-To: forrest-dev@xml.apache.org
Delivered-To: mailing list forrest-dev@xml.apache.org
Received: (qmail 87247 invoked by uid 500); 8 Sep 2002 21:48:39 -0000
Delivered-To: apmail-xml-forrest-cvs@apache.org
Received: (qmail 87244 invoked from network); 8 Sep 2002 21:48:39 -0000
Received: from icarus.apache.org (63.251.56.143) by daedalus.apache.org with SMTP; 8 Sep 2002 21:48:39 -0000
Received: (qmail 6638 invoked by uid 1454); 8 Sep 2002 21:48:38 -0000
Date: 8 Sep 2002 21:48:38 -0000
Message-ID: <20020908214838.6637.qmail@icarus.apache.org>
From: stevenn@apache.org
To: xml-forrest-cvs@apache.org
Subject: cvs commit: xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype DocDeclRule.java DocumentElementRule.java ProcessingInstructionRule.java SourceInfo.java SourceType.java SourceTypeAction.java SourceTypeRule.java XmlSchemaRule.java
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N
Status: O
X-Status:
X-Keywords:

stevenn     2002/09/08 14:48:38

Modified:    src/documentation/content/xdocs book.xml
Added:       src/documentation/content/xdocs cap.xml
             src/scratchpad/lib nekopull.jar
             src/scratchpad/src/java/org/apache/forrest/components/sourcetype
                      DocDeclRule.java DocumentElementRule.java
                      ProcessingInstructionRule.java SourceInfo.java
                      SourceType.java SourceTypeAction.java
                      SourceTypeRule.java XmlSchemaRule.java
Log:
SourceAction or the so-called 'content aware pipelines' patch
contributed by Bruno Dumon. This enables conditional processing of XML
documents based on their grammar, as indicated by their DTD, XML Schema
PI, root element or a PI in the prolog of the document.
Revision  Changes    Path
1.21      +1 -1      xml-forrest/src/documentation/content/xdocs/book.xml

Index: book.xml
===================================================================
RCS file: /home/cvs/xml-forrest/src/documentation/content/xdocs/book.xml,v
retrieving revision 1.20
retrieving revision 1.21
diff -u -r1.20 -r1.21
--- book.xml	18 Aug 2002 07:44:56 -0000	1.20
+++ book.xml	8 Sep 2002 21:48:38 -0000	1.21
@@ -33,12 +33,12 @@
-
+

1.1                  xml-forrest/src/documentation/content/xdocs/cap.xml

Index: cap.xml
===================================================================
SourceTypeAction
Intro

SourceTypeAction assigns a "type" (a string) to an XML file. This is done based on information occurring in the header of the XML file, up to the document (root) element. This type is then returned to the sitemap as a variable with the name 'sourcetype'. If no matching sourcetype can be found, null is returned and thus the contents of the action element will not be executed.

SourceTypeAction works by pull-parsing the document and collecting information such as the public id, the processing instructions, the document element local name and namespace, and the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes. This information is then compared with the rules described in the configuration of the SourceTypeAction.
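The scan described above can be sketched with the JDK's own StAX pull parser. This is a stand-in technique, not the NekoPull API that SourceTypeAction itself uses. The class name PrologScanSketch and the map keys are hypothetical, and the public id is left out because StAX does not expose it through a portable accessor.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: collects prolog information with StAX instead of NekoPull.
final class PrologScanSketch {
    static final String XSI_NAMESPACE = "http://www.w3.org/2001/XMLSchema-instance";

    static Map<String, String> scan(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Load nothing external, mirroring the intent of the original action.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);

        Map<String, String> info = new HashMap<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.PROCESSING_INSTRUCTION) {
                        // Processing instructions that occur before the document element.
                        info.put("pi:" + reader.getPITarget(), reader.getPIData());
                    } else if (event == XMLStreamConstants.START_ELEMENT) {
                        info.put("local-name", reader.getLocalName());
                        info.put("namespace", reader.getNamespaceURI());
                        info.put("schema-location",
                                reader.getAttributeValue(XSI_NAMESPACE, "schemaLocation"));
                        info.put("no-namespace-schema-location",
                                reader.getAttributeValue(XSI_NAMESPACE, "noNamespaceSchemaLocation"));
                        break; // stop at the document element; the rest is never read
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not scan the document prolog", e);
        }
        return info;
    }
}
```

Because the loop stops at the first start tag, the cost of the scan does not depend on the size of the document body.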

Configuration

The action should be declared and configured in the map:actions section of the sitemap. Example:
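The fragment below is a hedged reconstruction of what such a declaration might look like, not the example from the original document. The element and attribute names and the class name come from the Java sources in this commit; the action name "sourcetype", the sourcetype names, and the public-id, local-name and target values are illustrative assumptions.

```xml
<map:actions>
  <!-- the name "sourcetype" is an assumption; the class is the one added by this commit -->
  <map:action name="sourcetype"
              src="org.apache.forrest.components.sourcetype.SourceTypeAction">
    <!-- illustrative sourcetypes; the first one whose rules all match wins -->
    <sourcetype name="document-v11">
      <document-declaration public-id="-//APACHE//DTD Documentation V1.1//EN"/>
    </sourcetype>
    <sourcetype name="docbook">
      <document-element local-name="book"/>
    </sourcetype>
    <sourcetype name="faq-by-pi">
      <processing-instruction target="example-type" data="faq"/>
    </sourcetype>
  </map:action>
</map:actions>
```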


Each sourcetype-tag declares a source type. Inside the sourcetype-tag a number of rules can be defined, described below. The sourcetypes are checked in the order in which they are defined in the configuration; the first sourcetype whose rules all match is used.
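As a minimal sketch of this selection order, the following uses plain java.util types rather than the Avalon-based classes in this commit. FirstMatchSketch and firstMatch are hypothetical names, and the caller is assumed to pass an insertion-ordered map such as LinkedHashMap.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of "first sourcetype whose rules all match".
final class FirstMatchSketch {
    /**
     * Returns the name of the first type (in map iteration order) whose rules
     * all accept the document, or null when no type matches.
     */
    static <T> String firstMatch(Map<String, List<Predicate<T>>> typesInOrder, T doc) {
        for (Map.Entry<String, List<Predicate<T>>> type : typesInOrder.entrySet()) {
            boolean allMatch = true;
            for (Predicate<T> rule : type.getValue()) {
                if (!rule.test(doc)) {
                    allMatch = false;
                    break; // one failing rule rejects the whole type
                }
            }
            if (allMatch) {
                return type.getKey();
            }
        }
        return null;
    }
}
```

Note that a type with no rules at all matches every document in this sketch, so a catch-all type belongs at the end of the configuration.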

These are the available rules:

document-declaration
This rule checks the public ID. It takes one attribute, public-id.
document-element
This rule checks the local name and/or namespace of the document element. These are specified with the attributes local-name and namespace. At least one of these two is required.
processing-instruction
This rule checks a processing instruction. It can take two attributes: target and data. The target attribute is always required; the data attribute is optional.
w3c-xml-schema
This rule checks the value of the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes on the document element. These are specified with the attributes schema-location and no-namespace-schema-location.
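The document-element and w3c-xml-schema rules both take two optional attributes, of which at least one must be given. The sketch below illustrates one reading of that contract: every attribute that is configured must match, which is how XmlSchemaRule in this commit behaves. OptionalPairRuleSketch is a hypothetical name and the class is not part of the commit.

```java
// Hypothetical sketch: a rule with two optional expected values, at least one required.
final class OptionalPairRuleSketch {
    private final String first;   // e.g. local-name or schema-location, may be null
    private final String second;  // e.g. namespace or no-namespace-schema-location, may be null

    OptionalPairRuleSketch(String first, String second) {
        if (first == null && second == null) {
            throw new IllegalArgumentException("At least one of the two attributes is required");
        }
        this.first = first;
        this.second = second;
    }

    /** An unconfigured (null) attribute is ignored; a configured one must be equal. */
    boolean matches(String actualFirst, String actualSecond) {
        boolean firstOk = first == null || first.equals(actualFirst);
        boolean secondOk = second == null || second.equals(actualSecond);
        return firstOk && secondOk;
    }
}
```

With only local-name configured, for example, the namespace of the document element is not examined at all.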
Usage

The source whose sourcetype is to be determined must be specified using the 'src' attribute on the map:act element.
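A hedged sketch of such a pipeline follows; it is not the example from the original document. It assumes the action was declared under the name "sourcetype", and the match pattern, file paths and stylesheet naming scheme are hypothetical. Inside map:act, {sourcetype} refers to the value returned by the action and {../1} to the enclosing matcher.

```xml
<map:match pattern="**.html">
  <map:act type="sourcetype" src="content/xdocs/{1}.xml">
    <!-- executed only when a sourcetype was found -->
    <map:generate src="content/xdocs/{../1}.xml"/>
    <map:transform src="stylesheets/{sourcetype}-to-html.xsl"/>
    <map:serialize type="html"/>
  </map:act>
  <!-- reached when the action returned null, i.e. no sourcetype matched -->
  <map:generate src="content/xdocs/{1}.xml"/>
  <map:serialize type="xml"/>
</map:match>
```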

1.1                  xml-forrest/src/scratchpad/lib/nekopull.jar

	<<Binary file>>

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/DocDeclRule.java

Index: DocDeclRule.java
===================================================================
package org.apache.forrest.components.sourcetype;

import org.apache.avalon.framework.configuration.*;

/**
 * Rule which checks that the public id has a certain value.
 *
 * @author Bruno Dumon
 */
public class DocDeclRule implements SourceTypeRule {
    protected String publicId;

    public void configure(Configuration configuration) throws ConfigurationException {
        publicId = configuration.getAttribute("public-id");
    }

    public boolean matches(SourceInfo sourceInfo) {
        if (publicId.equals(sourceInfo.getPublicId()))
            return true;
        else
            return false;
    }
}

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/DocumentElementRule.java

Index: DocumentElementRule.java
===================================================================
package org.apache.forrest.components.sourcetype;

import org.apache.avalon.framework.configuration.Configuration;
import org.apache.avalon.framework.configuration.ConfigurationException;

/**
 * A Rule which checks the local name and namespace of the document element.
 *
 * @author Bruno Dumon
 */
public class DocumentElementRule implements SourceTypeRule {
    protected String localName;
    protected String namespace;

    public void configure(Configuration configuration) throws ConfigurationException {
        localName = configuration.getAttribute("local-name", null);
        namespace = configuration.getAttribute("namespace", null);
        if (localName == null && namespace == null)
            throw new ConfigurationException("Missing local-name and/or namespace attribute on document-element element at " + configuration.getLocation());
    }

    public boolean matches(SourceInfo sourceInfo) {
        if (localName != null && namespace != null
                && localName.equals(sourceInfo.getDocumentElementLocalName())
                && namespace.equals(sourceInfo.getDocumentElementNamespace()))
            return true;
        else if (localName != null && namespace == null
                && localName.equals(sourceInfo.getDocumentElementLocalName()))
            return true;
        else if (localName == null && namespace != null
                && namespace.equals(sourceInfo.getDocumentElementNamespace()))
            return true;
        else
            return false;
    }
}

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/ProcessingInstructionRule.java

Index: ProcessingInstructionRule.java
===================================================================
package org.apache.forrest.components.sourcetype;

import org.apache.avalon.framework.configuration.Configuration;
import org.apache.avalon.framework.configuration.ConfigurationException;

/**
 * A rule which checks that a processing instruction with certain data is present.
 */
public class ProcessingInstructionRule implements SourceTypeRule {
    protected String target;
    protected String data;

    public void configure(Configuration configuration) throws ConfigurationException {
        target = configuration.getAttribute("target");
        data = configuration.getAttribute("data", null);
    }

    public boolean matches(SourceInfo sourceInfo) {
        if (sourceInfo.hasProcessingInstruction(target)) {
            if (sourceInfo.getProcessingInstructionData(target) == null && data == null)
                return true;
            if (sourceInfo.getProcessingInstructionData(target) != null
                    && sourceInfo.getProcessingInstructionData(target).equals(data))
                return true;
        }
        return false;
    }
}

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/SourceInfo.java

Index: SourceInfo.java
===================================================================
package org.apache.forrest.components.sourcetype;

import java.util.HashMap;

/**
 * Contains information about an XML file. More precisely, the publicId, the processing instructions
 * occurring before the document element, the local name and namespace of the document element, and
 * the value of the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes. All of these
 * attributes can be null.
 *
 * @author Bruno Dumon
 */
public class SourceInfo {
    protected String publicId;
    protected String documentElementLocalName;
    protected String documentElementNamespace;
    protected String xsiSchemaLocation;
    protected String xsiNoNamespaceSchemaLocation;
    protected HashMap processingInstructions = new HashMap();

    public String getPublicId() {
        return publicId;
    }

    public void setPublicId(String publicId) {
        this.publicId = publicId;
    }

    public String getDocumentElementLocalName() {
        return documentElementLocalName;
    }

    public void setDocumentElementLocalName(String documentElementLocalName) {
        this.documentElementLocalName = documentElementLocalName;
    }

    public String getDocumentElementNamespace() {
        return documentElementNamespace;
    }

    public void setDocumentElementNamespace(String documentElementNamespace) {
        this.documentElementNamespace = documentElementNamespace;
    }

    public String getXsiSchemaLocation() {
        return xsiSchemaLocation;
    }

    public void setXsiSchemaLocation(String xsiSchemaLocation) {
        this.xsiSchemaLocation = xsiSchemaLocation;
    }

    public String getXsiNoNamespaceSchemaLocation() {
        return xsiNoNamespaceSchemaLocation;
    }

    public void setXsiNoNamespaceSchemaLocation(String xsiNoNamespaceSchemaLocation) {
        this.xsiNoNamespaceSchemaLocation = xsiNoNamespaceSchemaLocation;
    }

    public void addProcessingInstruction(String target, String data) {
        processingInstructions.put(target, data);
    }

    public boolean hasProcessingInstruction(String target) {
        return processingInstructions.containsKey(target);
    }

    public String getProcessingInstructionData(String target) {
        return (String)processingInstructions.get(target);
    }
}

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/SourceType.java

Index: SourceType.java
===================================================================
package org.apache.forrest.components.sourcetype;

import org.apache.avalon.framework.configuration.*;

import java.util.*;

/**
 * Represents a sourcetype.
 * A sourcetype has a name and a number of rules
 * which are used to determine if a certain document is of this sourcetype.
 *
 * @author Bruno Dumon
 */
public class SourceType implements Configurable {
    protected List rules = new ArrayList();
    protected String name;

    public void configure(Configuration configuration) throws ConfigurationException {
        name = configuration.getAttribute("name");
        Configuration[] ruleConfs = configuration.getChildren();
        for (int i = 0; i < ruleConfs.length; i++) {
            SourceTypeRule rule;
            if (ruleConfs[i].getName().equals("document-declaration"))
                rule = new DocDeclRule();
            else if (ruleConfs[i].getName().equals("processing-instruction"))
                rule = new ProcessingInstructionRule();
            else if (ruleConfs[i].getName().equals("w3c-xml-schema"))
                rule = new XmlSchemaRule();
            else if (ruleConfs[i].getName().equals("document-element"))
                rule = new DocumentElementRule();
            else
                throw new ConfigurationException("Unsupported element " + ruleConfs[i].getName() + " at " + ruleConfs[i].getLocation());
            rule.configure(ruleConfs[i]);
            rules.add(rule);
        }
    }

    public boolean matches(SourceInfo sourceInfo) {
        Iterator rulesIt = rules.iterator();
        boolean matches = true;
        while (rulesIt.hasNext()) {
            SourceTypeRule rule = (SourceTypeRule)rulesIt.next();
            matches = matches && rule.matches(sourceInfo);
            if (!matches)
                return false;
        }
        return matches;
    }

    public String getName() {
        return name;
    }
}

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/SourceTypeAction.java

Index: SourceTypeAction.java
===================================================================
package org.apache.forrest.components.sourcetype;

import org.cyberneko.pull.XMLPullParser;
import org.cyberneko.pull.XMLEvent;
import org.cyberneko.pull.event.*;
import org.cyberneko.pull.parsers.Xerces2;
import org.apache.xerces.xni.parser.XMLInputSource;
import org.apache.avalon.framework.configuration.*;
import org.apache.avalon.framework.thread.ThreadSafe;
import org.apache.avalon.framework.parameters.Parameters;
import org.apache.avalon.framework.logger.AbstractLogEnabled;
import org.apache.cocoon.acting.Action;
import org.apache.cocoon.environment.SourceResolver;
import org.apache.cocoon.environment.Redirector;
import org.apache.excalibur.source.Source;

import java.util.*;

/**
 * An action that assigns a "sourcetype" to a source. See the external documentation for
 * more information.
 *
 * @author Bruno Dumon
 */
public class SourceTypeAction extends AbstractLogEnabled implements Configurable, ThreadSafe, Action {
    protected List sourceTypes = new ArrayList();

    protected static final String XSI_NAMESPACE = "http://www.w3.org/2001/XMLSchema-instance";

    public void configure(Configuration configuration) throws ConfigurationException {
        Configuration[] sourceTypeConfs = configuration.getChildren("sourcetype");
        for (int i = 0; i < sourceTypeConfs.length; i++) {
            SourceType sourceType = new SourceType();
            sourceType.configure(sourceTypeConfs[i]);
            sourceTypes.add(sourceType);
        }
    }

    public Map act(Redirector redirector, SourceResolver sourceResolver, Map objectModel, String src, Parameters parameters)
            throws Exception {
        if (src == null || src.equals(""))
            throw new Exception("SourceTypeAction: src attribute should be defined and non-empty.");
        Source source = sourceResolver.resolveURI(src);
        XMLPullParser parser = new Xerces2();
        parser.setInputSource(new XMLInputSource(null, src, null, source.getInputStream(), null));

        // load nothing external
        parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        parser.setFeature("http://xml.org/sax/features/external-general-entities", false);
        parser.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

        // note: namespace-aware parsing is by default true

        SourceInfo sourceInfo = new SourceInfo();
        try {
            XMLEvent event;
            while ((event = parser.nextEvent()) != null) {
                if (event.type == XMLEvent.DOCTYPE_DECL) {
                    DoctypeDeclEvent doctypeDeclEvent = (DoctypeDeclEvent)event;
                    sourceInfo.setPublicId(doctypeDeclEvent.pubid);
                } else if (event.type == XMLEvent.PROCESSING_INSTRUCTION) {
                    ProcessingInstructionEvent piEvent = (ProcessingInstructionEvent)event;
                    sourceInfo.addProcessingInstruction(piEvent.target, piEvent.data != null ? piEvent.data.toString() : null);
                } else if (event.type == XMLEvent.ELEMENT) {
                    ElementEvent elementEvent = (ElementEvent)event;
                    sourceInfo.setDocumentElementLocalName(elementEvent.element.localpart);
                    sourceInfo.setDocumentElementNamespace(elementEvent.element.uri);

                    sourceInfo.setXsiSchemaLocation(elementEvent.attributes.getValue(XSI_NAMESPACE, "schemaLocation"));
                    sourceInfo.setXsiNoNamespaceSchemaLocation(elementEvent.attributes.getValue(XSI_NAMESPACE, "noNamespaceSchemaLocation"));

                    // stop parsing after the root element
                    break;
                }
            }
        } finally {
            parser.cleanup();
        }

        Iterator sourceTypeIt = sourceTypes.iterator();
        while (sourceTypeIt.hasNext()) {
            SourceType sourceType = (SourceType)sourceTypeIt.next();
            if (sourceType.matches(sourceInfo)) {
                HashMap returnMap = new HashMap();
                returnMap.put("sourcetype", sourceType.getName());
                getLogger().debug("SourceTypeAction: found sourcetype " + sourceType.getName() + " for source " + src);
                return returnMap;
            }
        }
        getLogger().debug("SourceTypeAction: found no sourcetype for source " + src);
        return null;
    }
}

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/SourceTypeRule.java

Index: SourceTypeRule.java
===================================================================
package org.apache.forrest.components.sourcetype;

import org.apache.avalon.framework.configuration.Configurable;

/**
 * Interface to be implemented by all sourcetype-defining rules.
 *
 * @author Bruno Dumon
 */
public interface SourceTypeRule extends Configurable {
    /**
     * Returns true if this rule is satisfied by the given SourceInfo.
     */
    public boolean matches(SourceInfo sourceInfo);
}

1.1                  xml-forrest/src/scratchpad/src/java/org/apache/forrest/components/sourcetype/XmlSchemaRule.java

Index: XmlSchemaRule.java
===================================================================
package org.apache.forrest.components.sourcetype;

import org.apache.avalon.framework.configuration.Configuration;
import org.apache.avalon.framework.configuration.ConfigurationException;

/**
 * A Rule which checks the value of the xsi:schemaLocation and xsi:noNamespaceSchemaLocation
 * attributes.
 *
 * @author Bruno Dumon
 */
public class XmlSchemaRule implements SourceTypeRule {
    protected String schemaLocation;
    protected String noNamespaceSchemaLocation;

    public void configure(Configuration configuration) throws ConfigurationException {
        schemaLocation = configuration.getAttribute("schema-location", null);
        noNamespaceSchemaLocation = configuration.getAttribute("no-namespace-schema-location", null);
        if (schemaLocation == null && noNamespaceSchemaLocation == null)
            throw new ConfigurationException("Missing schema-location and/or no-namespace-schema-location attribute on w3c-xml-schema element at " + configuration.getLocation());
    }

    public boolean matches(SourceInfo sourceInfo) {
        if (schemaLocation != null && noNamespaceSchemaLocation != null
                && schemaLocation.equals(sourceInfo.getXsiSchemaLocation())
                && noNamespaceSchemaLocation.equals(sourceInfo.getXsiNoNamespaceSchemaLocation()))
            return true;
        else if (schemaLocation != null && noNamespaceSchemaLocation == null
                && schemaLocation.equals(sourceInfo.getXsiSchemaLocation()))
            return true;
        else if (schemaLocation == null && noNamespaceSchemaLocation != null
                && noNamespaceSchemaLocation.equals(sourceInfo.getXsiNoNamespaceSchemaLocation()))
            return true;
        else
            return false;
    }
}