Date: Mon, 25 Mar 2002 20:31:58 -0500 (EST)
From: bob mcwhirter
To: ORO Users List
cc: "'gunther@aurora.regenstrief.org'"
Subject: RE: Comparing two Regex *Patterns*

I haven't looked at ORO's implementation, but typically regexps get
'compiled' into a state machine, which is basically a directed graph.
Some graph theory (along with some regexp-specific domain knowledge)
should be able to do some analysis.

	-bob

On Tue, 26 Mar 2002, Vollmer, Thomas - CannonSA wrote:

> Gunther,
>
> I will face a similar problem in the near future and would
> be very interested in what you and others find out about this
> problem. I haven't looked at possible approaches myself yet,
> as the issue isn't a pressing one at the moment. I am not a
> regex expert but I have the feeling that this could be a
> tough one.
> If it is feasible though, I would of course be
> all for adding it to ORO :-)
>
> Best regards,
> Thomas
>
> > -----Original Message-----
> > From: Gunther Schadow [mailto:gunther@aurora.regenstrief.org]
> > Sent: Monday, March 25, 2002 4:12 PM
> > To: oro-user
> > Subject: Comparing two Regex *Patterns*
> >
> > Hi,
> >
> > for an XSLT-based up-translation engine that I'm writing, I
> > am using the ORO matcher. Works very well (and better than the
> > other two options), thanks!
> >
> > Now I wonder about the following and wanted some input from
> > experts on the theory of regular expressions and their compilers
> > and matchers: is there some algebraic way of comparing two
> > regex patterns (not a string with a pattern, but two patterns)?
> > The point is to figure out if one pattern is (partially)
> > contained in another pattern and if strings matched by one
> > pattern are also matched by another pattern (i.e. the set
> > of strings matching pattern 2 is a subset of the set of
> > strings matching pattern 1). The use case for this is to
> > find ambiguities between rules using such related patterns
> > such that one can test these patterns in groups (or, for
> > XSLT, assign higher priority to the more specific pattern).
> > The point is that the computer should make this analysis,
> > not the guy who defines a transformation using many regexes.
> >
> > Generally, evaluating such similarity relationships could be
> > fairly involved. But I think this question should be part of
> > any theoretical discussion of regexes, and if anyone knows
> > of some information about this please holler. Actually I am
> > just now searching through the Annals of the ACM and there
> > are a few pertinent articles that suggest this is a hard
> > problem. Does anyone know of any implementations of any of this?
> > Do people here feel that comparison operations between Patterns
> > would be a good addition to ORO (if it is feasible)?
> >
> > thanks,
> > -Gunther Schadow
> >
> > PS: I would appreciate it if you could put my personal email
> > address on your replies. Thanks.
> >
> > --
> > Gunther Schadow, M.D., Ph.D.
> > gschadow@regenstrief.org
> > Medical Information Scientist, Regenstrief Institute for Health Care
> > Adjunct Assistant Professor, Indiana University School of Medicine
> > tel:1(317)630-7960
> > http://aurora.regenstrief.org
> >
> > --
> > To unsubscribe, e-mail:
> > For additional commands, e-mail:
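
A minimal sketch of the graph analysis suggested in the reply at the top of
this thread, for the subset question in the original post. It assumes both
patterns have already been turned into complete DFAs over one shared
alphabet. The DFAs here are built by hand; the names DFA, complete,
find_counterexample and is_subset are made up for illustration and are not
ORO's API or its internal representation. The idea: walk the product graph
of the two machines and look for a reachable state pair where the first
machine accepts and the second does not. If no such pair exists, every
string matched by the first pattern is also matched by the second.

```python
from collections import deque


class DFA:
    """A complete DFA: every state has a transition for every symbol."""

    def __init__(self, alphabet, start, accepting, delta):
        self.alphabet = alphabet          # iterable of single-character symbols
        self.start = start
        self.accepting = set(accepting)
        self.delta = delta                # dict: (state, symbol) -> state


def complete(alphabet, states, edges, dead):
    """Fill in every missing transition with a move to the dead state."""
    return {(s, c): edges.get((s, c), dead) for s in states for c in alphabet}


def find_counterexample(a, b):
    """Return a shortest string accepted by `a` but rejected by `b`,
    or None if the language of `a` is a subset of the language of `b`."""
    start = (a.start, b.start)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        (p, q), word = queue.popleft()
        if p in a.accepting and q not in b.accepting:
            return word                   # `a` matches here, `b` does not
        for sym in a.alphabet:
            nxt = (a.delta[(p, sym)], b.delta[(q, sym)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + sym))
    return None


def is_subset(a, b):
    return find_counterexample(a, b) is None


# Hand-built DFAs over the alphabet {a, b, c}; state 2 is the dead state.
ALPHABET = "abc"
STATES = [0, 1, 2]

# Pattern 'ab*' (the more specific one).
specific = DFA(ALPHABET, 0, {1},
               complete(ALPHABET, STATES, {(0, "a"): 1, (1, "b"): 1}, 2))

# Pattern 'a[bc]*' (the more general one).
general = DFA(ALPHABET, 0, {1},
              complete(ALPHABET, STATES,
                       {(0, "a"): 1, (1, "b"): 1, (1, "c"): 1}, 2))

print(is_subset(specific, general))            # True
print(is_subset(general, specific))            # False
print(find_counterexample(general, specific))  # ac
```

Two caveats on the sketch. Getting from a pattern to a DFA is the expensive
part: the usual NFA-to-DFA conversion can blow up exponentially, and
inclusion between regular expressions is PSPACE-complete in general, which
fits the feeling that this is a hard problem. Also, Perl5 features such as
backreferences go beyond regular languages, so this kind of check would
only cover patterns that avoid them.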