commons-dev mailing list archives

From "Arun Thomas" <>
Subject RE: suggestion for new StringUtils.method
Date Tue, 11 Nov 2003 18:32:06 GMT

Here's the bug entry:


-----Original Message-----
From: Arun Thomas 
Sent: Tuesday, November 11, 2003 10:31 AM
To: Jakarta Commons Developers List
Subject: RE: suggestion for new StringUtils.method

Take a look at the following bug entry currently in Bugzilla.... Your idea seems to be an
expansion of the desired functionality described by the "bug".  I would certainly add the
additional behaviour you propose (escape characters to prevent tokenization) to this bug.

It would be interesting, I think, if this functionality were provided as a StringTokenizer
replacement (subclass?) in line with Stephen's comment on the bug.  (This could be as simple
as a delegation to a StringUtils method, or a StringUtils convenience method could delegate
to the StringTokenizer replacement.)  
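As a rough illustration of the second option (a convenience method delegating to a tokenizer), here is a minimal sketch. The class name EmptyAwareTokenizer and its split method are hypothetical and are not part of Commons Lang; the class only mirrors the hasMoreTokens/nextToken method names of java.util.StringTokenizer rather than subclassing it, and it assumes the only extra behaviour wanted is that empty tokens are returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical tokenizer that, unlike java.util.StringTokenizer,
// returns empty tokens between adjacent separators.
class EmptyAwareTokenizer {
    private final String input;
    private final char separator;
    private int position = 0;          // index where the next token starts
    private boolean exhausted = false; // true once the last token was returned

    EmptyAwareTokenizer(String input, char separator) {
        this.input = input;
        this.separator = separator;
    }

    boolean hasMoreTokens() {
        return !exhausted;
    }

    String nextToken() {
        if (exhausted) {
            throw new NoSuchElementException();
        }
        int end = input.indexOf(separator, position);
        if (end < 0) {
            // No further separator: the rest of the input is the last token.
            exhausted = true;
            return input.substring(position);
        }
        String token = input.substring(position, end);
        position = end + 1;
        return token;
    }

    // Convenience method in the style of a StringUtils helper:
    // it simply drains the tokenizer into an array.
    static String[] split(String input, char separator) {
        EmptyAwareTokenizer tokenizer = new EmptyAwareTokenizer(input, separator);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens.toArray(new String[0]);
    }
}
```

With this arrangement the tokenizer owns the parsing logic and the static method stays a thin wrapper; the opposite direction (tokenizer delegating to a static method) would work equally well for short inputs.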


-----Original Message-----
From: Inger, Matthew [] 
Sent: Tuesday, November 11, 2003 10:03 AM
To: ''
Subject: suggestion for new StringUtils.method

The following method might be extremely useful for people:

String [] undelimit(String input, char separatorChar, char quoteChar);

This method splits a string according to a delimiter-separated format
(CSV, for example).  It takes quoting into account and allows for empty tokens.
So a string like the following:

a, , b, "c,d,e", "f""g""h",

would return the following tokens:

1- a
2- <blank>
3- b
4- c,d,e
5- f"g"h
6- <blank>

It happens to strip leading whitespace, but I could always make that optional.

It's an efficient single-pass algorithm: it runs through the underlying character array one
character at a time to build the tokens.
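For readers who want something concrete, here is a minimal sketch of a character-at-a-time pass with the proposed signature. It is not the poster's actual code: the enclosing class name DelimitedSplitter is hypothetical, and the handling of edge cases (a doubled quote inside a quoted section stands for one literal quote, an unterminated quote runs to the end of the input, an empty input yields one empty token) is an assumption chosen to reproduce the example above.

```java
import java.util.ArrayList;
import java.util.List;

class DelimitedSplitter {
    static String[] undelimit(String input, char separatorChar, char quoteChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean tokenStarted = false; // false while leading whitespace is being skipped
        char[] chars = input.toCharArray();

        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            if (inQuotes) {
                if (c != quoteChar) {
                    current.append(c); // separators are literal inside quotes
                } else if (i + 1 < chars.length && chars[i + 1] == quoteChar) {
                    current.append(quoteChar); // doubled quote -> one literal quote
                    i++;
                } else {
                    inQuotes = false; // closing quote
                }
            } else if (c == separatorChar) {
                tokens.add(current.toString()); // may be an empty token
                current.setLength(0);
                tokenStarted = false;
            } else if (c == quoteChar) {
                inQuotes = true;
                tokenStarted = true;
            } else if (!tokenStarted && Character.isWhitespace(c)) {
                // strip leading whitespace
            } else {
                current.append(c);
                tokenStarted = true;
            }
        }
        tokens.add(current.toString()); // token after the last separator
        return tokens.toArray(new String[0]);
    }
}
```

Called as undelimit("a, , b, \"c,d,e\", \"f\"\"g\"\"h\",", ',', '"'), this sketch yields the six tokens listed above, including the two blank ones. Making the whitespace stripping optional would only need an extra boolean parameter guarding the whitespace branch.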

Any thoughts?

To unsubscribe, e-mail:
For additional commands, e-mail:
