commons-user mailing list archives

From n...@dad.org
Subject [util.regex] A package that provides the power of regular expressions without regular expressions
Date Thu, 16 Jul 2015 16:51:15 GMT
(I am sorry if this email is improperly addressed. If it is, would somebody
tell me how to address it?)

We have a package that provides Java programmers with the power of
regular expressions without regular expressions. We would like to make the
package part of Apache Commons. Can somebody tell us what steps are involved?

From the package's package description:

Provides the power of regular expressions without regular expressions.

Some people, when confronted with a problem, think, "I know, I'll use regular
expressions." Now they have two problems. - Jamie Zawinski

Naomi is a package that enables Java programmers to search for and optionally
replace textual patterns in documents or strings. Simple tasks of this
sort--such as systematically changing multiple filenames or modifying text in
documents--are often performed using inline or one-shot scripts written using
tools such as awk, sed, or Perl, all of which use (somewhat different)
variants of regular expression ("RE") syntax. The standard java.util.regex
package provides its own set of RE tools for use in Java. But performing
complex searching and match-and-replace tasks or performing such tasks
repetitively with minor variations can be quite difficult using REs, and RE
syntax is inherently alien to Java's object-oriented style of programming.
Naomi offers a much more transparent way of attacking such problems that is
more compatible with Java style and much better suited to larger problems.

That is, Naomi provides the power of regular expressions without regular
expressions.

Although regular expressions offer a quick and terse way to perform simple
matching and replacement operations, they are infuriatingly difficult to
scale up to more complex problems. Their terseness, arcane notation, reliance
on the heavy use of escape characters, and lack of modularity and
encapsulation make them difficult to write, read, understand, maintain,
modify, reuse, and extend, for all but the simplest of problems. Naomi's
approach, though considerably less terse, offers far greater clarity,
modularity, reusability, and extensibility. Naomi is not intended as a
replacement for regular expressions in simple cases but rather as an
alternative for tackling more complex problems and for producing more
transparent solutions.
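
To make the point about terseness and escape characters concrete, here is a
minimal sketch that uses only the standard java.util.regex package (not
Naomi, whose API is not shown in this message). The class name
RegexEscapingDemo, the method reformatDates, and the date-rewriting task are
assumptions chosen purely for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexEscapingDemo {
    // Matches a parenthesized ISO date such as "(2015-07-16)".
    // Every backslash is doubled for the Java string literal, and the
    // literal parentheses need escaping so they are not read as groups.
    private static final Pattern PAREN_DATE =
            Pattern.compile("\\((\\d{4})-(\\d{2})-(\\d{2})\\)");

    // Rewrites each "(yyyy-MM-dd)" as "[dd/MM/yyyy]".
    static String reformatDates(String text) {
        Matcher m = PAREN_DATE.matcher(text);
        // $1, $2, $3 refer to the year, month, and day capture groups.
        return m.replaceAll("[$3/$2/$1]");
    }

    public static void main(String[] args) {
        // prints "Posted [16/07/2015] to the list"
        System.out.println(reformatDates("Posted (2015-07-16) to the list"));
    }
}
```

Even in this small case, the pattern string mixes two layers of escaping
(Java string literals and RE syntax) with positional group numbers, which is
the kind of notation the description above says becomes hard to maintain as
a problem grows.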

Even pattern matching problems that start out seeming to be simple have a way
of turning out to be more complex than they first appeared--either because
they evolve to cover additional cases or because we realize that our original
solution was insufficiently general. Naomi can therefore be a better approach
even for simple problems, since they often turn into complex ones.

Naomi provides two main classes, Pattern and Matcher, which are analogous to
the corresponding classes in the java.util.regex package. These Naomi
classes, along with the subclasses of its Pattern class and some other
ancillary classes, provide a way to match patterns in strings without using
regular expressions.
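
For readers who have not used them, the following sketch shows the
java.util.regex Pattern and Matcher classes that the Naomi classes are said
to be analogous to. It does not show Naomi itself; the class name
RegexMatcherDemo and the method capitalizedWords are hypothetical names
introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatcherDemo {
    // A capitalized word: one uppercase ASCII letter followed by one or
    // more lowercase ASCII letters, bounded by word boundaries.
    private static final Pattern CAPITALIZED =
            Pattern.compile("\\b[A-Z][a-z]+\\b");

    // Collects every capitalized word in the input, in order of appearance.
    static List<String> capitalizedWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = CAPITALIZED.matcher(text);
        while (m.find()) {          // advance to the next match, if any
            words.add(m.group());   // the text of the current match
        }
        return words;
    }

    public static void main(String[] args) {
        // prints "[Naomi, Pattern, Matcher]"
        System.out.println(
                capitalizedWords("Naomi provides Pattern and Matcher classes"));
    }
}
```

In java.util.regex, a Pattern is compiled once from an RE string and a
Matcher applies it to a particular input. According to the description
above, Naomi keeps this two-class division but builds its patterns without
RE strings.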

Java programmers who are familiar with the use of regular expressions should
have little trouble grasping the Naomi paradigm and reaping its advantages
for complex pattern matching and replacement. Those who have not been exposed
to REs should find Naomi a kinder, gentler approach to dealing with textual
pattern matching and replacement, in a way that scales well to large
problems.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

