From: "Marc Portier"
Reply-To: cocoon-dev@xml.apache.org
Subject: RE: A Transformer in progress....
Date: Fri, 10 May 2002 09:42:25 +0200

> > I'm working on a Transformer that processes specifically text nodes and,
> > using regular expressions, wraps matched portions of a node in a tag. I'm
> > really just getting started on it - I have the basics working, but still
> > need to be able to specify the rules in an external file, etc. It has been
> > an interesting exercise so far, and the intent is to be able to detect
> > things like dates, currency amounts, and units of measure in a text node,
> > and mark them for later processing.
> >
> > I am planning on allowing rules to be specified in an external file
> > identified at component configuration, or directly in the component
> > configuration. I am also planning on allowing the "replacement" to be a
> > complete fragment with groups from the matched expression referenceable
> > (and replaced) in either attribute values or text nodes. (Currently I
> > merely enclose the match in a tag.)
> >
> > Before I move on, has anyone else already done something like this? Does
> > anyone (other than me) think it would be useful?
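
For concreteness, the text-node wrapping described in the quoted message could look roughly like the sketch below. It is not a Cocoon Transformer: it uses Python's xml.sax as a stand-in for the SAX pipeline, and the names RegexWrapFilter and wrap_matches, as well as the date pattern in the usage note, are made up for illustration. It only encloses each match in a tag (the "current" behaviour mentioned above) and assumes the pattern never matches the empty string.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class RegexWrapFilter(XMLFilterBase):
    """SAX filter that wraps every regex match inside a text node in an element."""

    def __init__(self, parent, pattern, tag):
        super().__init__(parent)
        self._pattern = re.compile(pattern)
        self._tag = tag
        self._buffer = []  # SAX may deliver one text node in several chunks

    def _flush(self):
        # Emit the buffered text, wrapping each match in <tag>...</tag>.
        text = "".join(self._buffer)
        self._buffer = []
        pos = 0
        for match in self._pattern.finditer(text):
            if match.start() > pos:
                super().characters(text[pos:match.start()])
            super().startElement(self._tag, AttributesImpl({}))
            super().characters(match.group(0))
            super().endElement(self._tag)
            pos = match.end()
        if pos < len(text):
            super().characters(text[pos:])

    def characters(self, content):
        # Hold text back until the next non-text event so matches are not split.
        self._buffer.append(content)

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)

    def processingInstruction(self, target, data):
        self._flush()
        super().processingInstruction(target, data)


def wrap_matches(xml_text, pattern, tag):
    """Run xml_text through the filter and return the serialized result."""
    out = io.StringIO()
    reader = RegexWrapFilter(xml.sax.make_parser(), pattern, tag)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

Calling wrap_matches("<p>Due 10 May 2002, paid.</p>", r"\d{1,2} \w+ \d{4}", "date") should mark the date inside the paragraph with a date element. Rules from an external file and replacement fragments with group references are left out of the sketch.
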
I did something similar around January; after that the interest/needs kinda
shifted, so I haven't continued on it since. I still plan on taking it up
around the summer or so (if you want, I can make my current stuff available
and join in some discussions).

The biggest difference, however, is that you're assuming the input has good
but too little markup (so you go for a transformer), while we were scratching
the itch of pure text input and/or bad markup (like HTML that jTidy can't
handle, or, even now that there is nekoHTML, cases where the regex approach is
easier than the XSLT afterwards because of the mess in the HTML). So we chose
a generator as lifeform.

At the time we started we collided with some joint thinking activity about
adding regex kind of support inside XSLT 2.0; see for some of those
discussions:
http://www.biglist.com/lists/xsl-list/archives/200201/msg00488.html

I must say I haven't followed the further development of XSLT 2.0 since, so
maybe someone else could comment on any future for this kind of stuff inside
the spec (and thus in implementations like Xalan or Saxon).

> > Ok...now to my real question...
> >
> > Does anyone know of existing code that I can use to track and identify if
> > the current point in the SAX stream matches a simplified XPath expression?
> > I would really like to apply an expression rule set based on an XPath
> > subset.

Alas, I don't know of such a beast (would be nice though).

So you got me triggered about thinking about this :-)
Can we define 'simplified' XPath?

SAX gives you a timely snapshot of the current position in the XML file, so
you would easily be able to have some kind of match for a simple hierarchy of
elements (it wouldn't even need to be root based, I guess). Even sliding in
some attribute tests should be possible, and also position evaluation as long
as it's not =last()...

SAX just doesn't allow you to look into the future, so a lot of XPath you will
not be able to do.

> > Any comments or suggestions?
You'll have to make a trade-off: how far can you cripple XPath and still
support your needs, yet still beat the maybe awkward but not totally wrong
approach of building a temporary DOM tree internally in your Cocoon
transformer to allow real XPath stuff on it? (After all, something similar
happens to some extent inside the XSLT process.)

> > Thanks!

Please keep us posted on your findings and progress.

-marc=
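
As a rough illustration of the "simple hierarchy of elements" matching discussed above, here is one possible sketch: keep a stack of the currently open elements and compare its tail against a parsed path. It is only a sketch under assumptions, written against Python's xml.sax rather than Cocoon. The path syntax (element names, `*`, an optional `[@attr='value']` test, a leading `/` for root-based paths) is an invented subset, and parse_path, matches, PathTracker and find_matches are hypothetical names. Position tests are left out, and attribute values containing `/` are not supported.

```python
import re
import xml.sax

# One step: an element name or "*", optionally followed by [@attr='value'].
STEP = re.compile(
    r"^(?P<name>[\w.-]+|\*)(?:\[@(?P<attr>[\w.-]+)='(?P<value>[^']*)'\])?$"
)


def parse_path(expr):
    """Parse e.g. "/doc/section[@type='intro']/para" into (rooted, steps)."""
    rooted = expr.startswith("/")
    steps = []
    for raw in expr.strip("/").split("/"):
        m = STEP.match(raw)
        if m is None:
            raise ValueError(f"unsupported step: {raw!r}")
        steps.append((m.group("name"), m.group("attr"), m.group("value")))
    return rooted, steps


def matches(path, stack):
    """True if the open-element stack ends with the path (or equals it, if rooted)."""
    rooted, steps = path
    if len(steps) > len(stack) or (rooted and len(steps) != len(stack)):
        return False
    tail = stack[len(stack) - len(steps):]
    for (name, attr, value), (el_name, el_attrs) in zip(steps, tail):
        if name != "*" and name != el_name:
            return False
        if attr is not None and el_attrs.get(attr) != value:
            return False
    return True


class PathTracker(xml.sax.ContentHandler):
    """Records the full element path each time the current position matches."""

    def __init__(self, expr):
        super().__init__()
        self._path = parse_path(expr)
        self._stack = []  # (name, attributes) of every open element
        self.hits = []

    def startElement(self, name, attrs):
        self._stack.append((name, dict(attrs.items())))
        if matches(self._path, self._stack):
            self.hits.append("/" + "/".join(n for n, _ in self._stack))

    def endElement(self, name):
        self._stack.pop()


def find_matches(xml_text, expr):
    """Parse xml_text and return the element paths where expr matched."""
    tracker = PathTracker(expr)
    xml.sax.parseString(xml_text.encode("utf-8"), tracker)
    return tracker.hits
```

Only ancestors and the current element's attributes are consulted, which is exactly what SAX has already delivered at startElement time. Anything that needs following siblings or children (such as last()) would still require buffering, for example the temporary DOM tree mentioned above.
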