commons-dev mailing list archives

From "Stephen Colebourne" <scolebou...@btopenworld.com>
Subject Re: [lang] [Bug 22692] - StringUtils.split ignores empty items
Date Fri, 14 Nov 2003 23:16:04 GMT
Should work, I'll let you check the speed ;-)
Stephen

----- Original Message ----- 
From: "Inger, Matthew" <inger@Synygy.com>
> I see what you mean.  It appears that, as robust as CharSet is, it
> does way too much, and is too slow for what we need it for.
> 
> I'm going back to DelimiterSet, but rather than an interface,
> it will be an inner class with several constructors:
> 
>       public DelimiterSet(char[]);
>       public DelimiterSet(String);
>       public DelimiterSet(char);
> 
> and two useful methods:
> 
>       public boolean contains(char);
>       public char[] getChars();
> 
> This will be an immutable object.  The
> constructor sorts the character array
> using Arrays.sort, and the contains method
> uses Arrays.binarySearch.  This should give
> us a pretty efficient algorithm for the
> contains method.  There's also a predefined
> whitespace delimiter set "WHITESPACE_DELIMITERSET"
> so people don't have to construct their own
> all the time.
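
The design described in the quoted message above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was committed: the class is shown as a package-private top-level class rather than an inner class so that it compiles on its own, and the characters in WHITESPACE_DELIMITERSET are a guess, since the message does not list them.

```java
import java.util.Arrays;

// Hypothetical sketch of the immutable DelimiterSet described in the message.
final class DelimiterSet {

    // Predefined set; the exact whitespace characters here are an assumption.
    static final DelimiterSet WHITESPACE_DELIMITERSET =
            new DelimiterSet(" \t\n\r\f");

    // Sorted copy of the delimiters; never handed out directly.
    private final char[] chars;

    DelimiterSet(char[] delimiters) {
        // Defensive copy keeps the object immutable; sorting enables binary search.
        this.chars = delimiters.clone();
        Arrays.sort(this.chars);
    }

    DelimiterSet(String delimiters) {
        this(delimiters.toCharArray());
    }

    DelimiterSet(char delimiter) {
        this(new char[] { delimiter });
    }

    boolean contains(char c) {
        // binarySearch returns a non-negative index only when c is present.
        return Arrays.binarySearch(chars, c) >= 0;
    }

    char[] getChars() {
        // Return a copy so callers cannot modify the internal array.
        return chars.clone();
    }
}
```

With this sketch, new DelimiterSet(",;:").contains(';') is true and contains('x') is false. Whether binary search actually beats a linear scan for the very small sets typical of delimiters is something that would need to be measured.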
> 
> -----Original Message-----
> From: Stephen Colebourne [mailto:scolebourne@btopenworld.com]
> Sent: Friday, November 14, 2003 5:26 PM
> To: Jakarta Commons Developers List
> Subject: Re: [lang] [Bug 22692] - StringUtils.split ignores empty items
> 
> 
> An interesting idea, although the performance would be very poor without
> some effort in the CharSet class.
> Stephen
> 
> From: "Todd V. Jonker" <todd@consciouscode.com>
> > Or just use lang.CharSet
> >
> >
> > On Fri, 14 Nov 2003 16:58:45 -0500, "Inger, Matthew" <inger@Synygy.com>
> > said:
> > > What about an interface:
> > >
> > > public class DelimitedTokenizer {
> > >
> > >    public static interface DelimiterSet {
> > >        public boolean isDelimiter(char c);
> > >    }
> > > }
> > >
> > > and having the ability to pass in this
> > > interface.  Of course, we'd still have a
> > > single char version as well, so someone
> > > might pass either a single char or an implementation
> > > of this interface as the delimiter.  I suppose I could
> > > do the same thing for quotes, but I find that less useful.
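
As a rough illustration of the interface idea in the quoted message above, a tokenizer might accept either a single char or a DelimiterSet implementation. The split method, its List return type, and the choice to keep empty items (the behavior requested in the bug title) are assumptions made for this sketch; the message only gives the interface itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; only the DelimiterSet interface comes from the message.
final class DelimitedTokenizer {

    interface DelimiterSet {
        boolean isDelimiter(char c);
    }

    // Single-char version, implemented by delegating to the interface version.
    static List<String> split(String input, final char delimiter) {
        return split(input, new DelimiterSet() {
            public boolean isDelimiter(char c) {
                return c == delimiter;
            }
        });
    }

    // Interface version: every delimiter ends a token, so adjacent
    // delimiters produce empty strings instead of being skipped.
    static List<String> split(String input, DelimiterSet delimiters) {
        List<String> tokens = new ArrayList<String>();
        int start = 0;
        for (int i = 0; i < input.length(); i++) {
            if (delimiters.isDelimiter(input.charAt(i))) {
                tokens.add(input.substring(start, i)); // may be empty
                start = i + 1;
            }
        }
        // The text after the last delimiter is always a token, even if empty.
        tokens.add(input.substring(start));
        return tokens;
    }
}
```

Under these assumptions, DelimitedTokenizer.split("a,,b", ',') yields the three items "a", "" and "b", so the empty item between the two commas is kept.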
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: commons-dev-help@jakarta.apache.org
> >
> 
> 
> 



