commons-dev mailing list archives

From "Arun Thomas" <arun.tho...@paybytouch.com>
Subject RE: suggestion for new StringUtils.method
Date Tue, 11 Nov 2003 18:31:20 GMT
Take a look at the following bug entry currently in Bugzilla.... Your idea seems to be an expansion
of the desired functionality described by the "bug".  I would certainly add the additional
behaviour you propose (escape characters to prevent tokenization) to this bug.  

It would be interesting, I think, if this functionality were provided as a StringTokenizer
replacement (subclass?) in line with Stephen's comment on the bug.  (This could be as simple
as a delegation to a StringUtils method, or a StringUtils convenience method could delegate
to the StringTokenizer replacement.)

Cheers, 
-AMT

-----Original Message-----
From: Inger, Matthew [mailto:inger@Synygy.com] 
Sent: Tuesday, November 11, 2003 10:03 AM
To: 'commons-dev@jakarta.apache.org'
Subject: suggestion for new StringUtils.method


The following method might be extremely useful for people:

String [] undelimit(String input, char separatorChar, char quoteChar);

This method splits a string according to a delimiter-separated format
(CSV, for example).  It takes quoting into account and allows for empty tokens.
So a string like the following:

a, , b, "c,d,e", "f""g""h",

would return the following tokens:

1- a
2- <blank>
3- b
4- c,d,e
5- f"g"h
6- <blank>

It happens to strip leading whitespace, but I could always make that optional.

It's an efficient single-pass algorithm: it runs through the underlying character array one
character at a time to build the tokens.
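
For illustration, here is a minimal sketch of how such a single-pass method might look. It is
not the code from the proposal, which was not posted; the class name UndelimitSketch and the
exact whitespace rule (leading whitespace is dropped only outside quotes) are assumptions made
for this example. A doubled quote character inside a quoted token is treated as one literal quote.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed method; not part of Commons Lang.
final class UndelimitSketch {

    static String[] undelimit(String input, char separatorChar, char quoteChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean tokenStarted = false; // false while leading whitespace is being skipped

        char[] chars = input.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            if (inQuotes) {
                if (c == quoteChar) {
                    if (i + 1 < chars.length && chars[i + 1] == quoteChar) {
                        // Doubled quote inside a quoted token: emit one literal quote.
                        current.append(quoteChar);
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    current.append(c); // separators are ordinary characters here
                }
            } else if (c == separatorChar) {
                // End of token; an empty buffer yields an empty token.
                tokens.add(current.toString());
                current.setLength(0);
                tokenStarted = false;
            } else if (c == quoteChar) {
                inQuotes = true;
                tokenStarted = true;
            } else if (!tokenStarted && Character.isWhitespace(c)) {
                // Assumption: strip leading whitespace of each token.
            } else {
                current.append(c);
                tokenStarted = true;
            }
        }
        // The text after the last separator is always a token, even if empty.
        tokens.add(current.toString());
        return tokens.toArray(new String[0]);
    }
}
```

Calling UndelimitSketch.undelimit with the example string above, ',' as the separator and '"' as
the quote character yields the six tokens listed. The sketch does not handle a null input or an
unterminated quote specially; a real implementation would need to decide on both.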

Any thoughts?
