From: "Scott Sanders"
To: "'Jakarta Commons Developers List'"
Subject: RE: Thoughts on StringUtils architecture
Date: Tue, 11 Dec 2001 12:04:54 -0800

But then you would just use Velocity :)

Scott

> -----Original Message-----
> From: Laird Nelson [mailto:ljnelson@yahoo.com]
> Sent: Tuesday, December 11, 2001 11:07 AM
> To: commons-dev@jakarta.apache.org
> Subject: Thoughts on StringUtils architecture
>
> Here's an architectural thought that occurred to me.  It's
> related to the overall architecture of StringUtils.
>
> When munging text, frequently you want to work on isolated
> Strings.
> But just as frequently you want to work on character
> streams, which read their stuff in chunks.  What happens if
> you read a chunk, and the last two characters of that chunk
> are the *first* two characters of your three-character-long
> String-to-be-escaped?  Passing it to a stateless
> escape() method, for example, won't escape the last two
> characters, because of course it doesn't know that the third
> one is on the way.
>
> The architecture I've found that works pretty well--although
> it seems like overkill when presented this way, so bear with
> me--is something like this (apologies if anyone finds this
> boringly obvious; it was a Moment for me and my slow brain :-)):
>
> Suppose you want to interpolate variable references ${like}
> ${this} and replace them with, say,
> System.getProperty("like") and System.getProperty("this").
> And suppose you want to do that work so that you can invoke
> it from a standalone class like StringUtils or a Reader class.
>
> The best bet is to implement a parser that takes in a
> StringBuffer (the raw text), some kind of value object that
> holds the parser's state, and that returns something
> convenient, like the StringBuffer interpolated so far, or the
> new state.  java.text.ParsePosition is a bare bones example
> of this sort of thing, used for java.text.MessageFormat etc.
>
> That way if you call the parser several times on chunks of
> text that look, for example, like this:
>
>   Chunk 1: Hi, there, ${us
>   Chunk 2: er.name}!  Earn free $
>   Chunk 3: $$!
>
> ...the parser will report, via the state object, whether it's
> done with a piece yet, and if you're in a Reader you can pay
> attention to this and if you're in, say, StringUtils, you can
> ignore it.
>
> Now if you invoke the parser from a standalone class like
> StringUtils, you just ignore the fact that it's not done yet,
> and you get, as results:
>
>   Result of munging chunk 1: Hi, there, ${us
>   Result of munging chunk 2: er.name}!  Earn free $
>   Result of munging chunk 3: $$!
>
> ...i.e. in this stupid case the same as what you put in.  But
> if you invoke the parser via a Reader using the same chunks,
> you can see that you could build the Reader in such a way to
> have a cache that would let you return this:
>
>   Result of three read(char[], int, int) calls: Hi, there, lnelson!
>   Earn free $$$!
>
> So in general for greater-than-single-character munging, it pays to
> create:
>
> 1.  A parser where you pass in its initial state each time you
>     parse/munge the raw text
> 2.  A standalone method/class that simply invokes the parser once on
>     the supplied String and ignores whether it's done or not
> 3.  A reader (or writer, or both) that feeds the parser as little as
>     the parser needs to complete his work
>
> I bring this up simply to call attention to it--basically to
> point out that one should remember character streams when one
> is building String-whacking routines such as escape().
>
> Cheers,
> Laird
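
For anyone who wants to see the three pieces from the quoted message side by side, here is a minimal sketch in Java. Every name in it (InterpolationState, Interpolator, InterpolatingReader, the chunkSize parameter) is hypothetical and is not part of StringUtils or any Commons API. It also assumes a lookup function is passed in rather than calling System.getProperty directly, and it uses java.util.function.Function, which requires Java 8 or later. Unknown variable names are passed through unchanged, which is an assumption of this sketch and not something the original message specifies.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.function.Function;

/** Hypothetical state object: an unfinished "$" or "${name" carried between calls. */
final class InterpolationState {
    final StringBuilder pending = new StringBuilder();
    boolean isDone() { return pending.length() == 0; }
}

/** Hypothetical parser (piece 1) plus the standalone entry point (piece 2). */
final class Interpolator {
    /** Parses one chunk and returns only the text that is safe to emit so far. */
    static String parse(CharSequence chunk, InterpolationState state,
                        Function<String, String> lookup) {
        StringBuilder out = new StringBuilder();
        StringBuilder p = state.pending;
        for (int i = 0; i < chunk.length(); i++) {
            char c = chunk.charAt(i);
            if (p.length() == 0) {                    // plain text
                if (c == '$') p.append(c); else out.append(c);
            } else if (p.length() == 1) {             // a lone "$" is pending
                if (c == '{') p.append(c);
                else if (c == '$') out.append('$');   // emit the older "$", keep the newer one
                else { out.append('$').append(c); p.setLength(0); }
            } else if (c == '}') {                    // end of "${name}"
                String value = lookup.apply(p.substring(2));
                out.append(value != null ? value : p + "}"); // unknown names pass through
                p.setLength(0);
            } else {
                p.append(c);                          // still inside "${name"
            }
        }
        return out.toString();
    }

    /** Standalone use: one call, ignore "not done", emit any leftovers unchanged. */
    static String interpolate(String text, Function<String, String> lookup) {
        InterpolationState state = new InterpolationState();
        return parse(text, state, lookup) + state.pending;
    }
}

/** Hypothetical Reader (piece 3): keeps one state object across underlying reads. */
final class InterpolatingReader extends Reader {
    private final Reader in;
    private final Function<String, String> lookup;
    private final InterpolationState state = new InterpolationState();
    private final char[] chunk;
    private String ready = "";   // parsed output not yet handed to the caller
    private int pos = 0;
    private boolean eof = false;

    InterpolatingReader(Reader in, int chunkSize, Function<String, String> lookup) {
        this.in = in;
        this.chunk = new char[chunkSize];   // chunkSize must be positive
        this.lookup = lookup;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        while (pos == ready.length() && !eof) {       // cache is empty: parse another chunk
            int n = in.read(chunk, 0, chunk.length);
            if (n < 0) {                              // end of input: flush unfinished text
                eof = true;
                ready = state.pending.toString();
                state.pending.setLength(0);
            } else {
                ready = Interpolator.parse(new String(chunk, 0, n), state, lookup);
            }
            pos = 0;
        }
        if (pos == ready.length()) return -1;
        int n = Math.min(len, ready.length() - pos);
        ready.getChars(pos, pos + n, buf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() throws IOException { in.close(); }
}
```

Calling Interpolator.interpolate on each of the three chunks separately hands back each chunk unchanged, as in the first set of results above. Feeding the same chunks through Interpolator.parse with one shared InterpolationState, or wrapping the source in an InterpolatingReader, lets the variable that straddles the chunk boundary be resolved. The state object's isDone() is what a Reader would consult to decide whether it still needs more input.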