commons-issues mailing list archives

From "Michael Knapp (JIRA)"
Subject [jira] [Commented] (LANG-860) String split with an escape pattern
Date Sat, 24 Nov 2012 23:26:58 GMT


Michael Knapp commented on LANG-860:

I beg to differ. commons-csv assumes there can be an escape character; my code assumes there
can be an escape pattern. My code handles a much broader range of problems than CSV.
For example, what if you want to get all the parenthesized text out of a document? commons-csv
cannot do that because '(' and ')' are different characters. commons-csv also offers no method
to retain the delimiters that you split on; my code does. Let's say you split on a pattern
matching open and closed parentheses: no existing split function in commons-lang, and no function
in commons-csv, is able to retain the text that matched your delimiter, but my code is.
The code I wrote does not replace commons-csv, nor does it try to. commons-csv handles comments,
empty lines, trimming text, and a whole lot more that is out of the scope of my code. Also,
if you expect anybody to use commons-csv, you should really publish it to the central Maven repository
and document it a little more.
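
To make the "retain the delimiters" point concrete, here is a minimal sketch in Java. It is not the code from the attached patch: the class and method names (KeepDelimitersSketch, splitKeepingDelimiters) are hypothetical, and it relies only on java.util.regex. It assumes the delimiter is a regular expression matching one parenthesized group, and it returns the text between matches together with the matched text itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeepDelimitersSketch {

    // Hypothetical helper (not from StringUtilsSplitEscapingly.patch):
    // splits text on a regex delimiter and keeps each matched delimiter
    // as its own token in the result.
    static List<String> splitKeepingDelimiters(String text, Pattern delimiter) {
        List<String> tokens = new ArrayList<>();
        Matcher m = delimiter.matcher(text);
        int last = 0; // end of the previous match
        while (m.find()) {
            if (m.start() > last) {
                // plain text between the previous delimiter and this one
                tokens.add(text.substring(last, m.start()));
            }
            if (m.end() > m.start()) {
                // the delimiter itself; empty matches are skipped
                tokens.add(m.group());
            }
            last = m.end();
        }
        if (last < text.length()) {
            tokens.add(text.substring(last)); // trailing text
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed pattern: an opening parenthesis, anything but ')', then ')'.
        Pattern parens = Pattern.compile("\\([^)]*\\)");
        for (String token : splitKeepingDelimiters("see (a) and (b) here", parens)) {
            System.out.println("[" + token + "]");
        }
    }
}
```

For the input "see (a) and (b) here" this yields the five tokens "see ", "(a)", " and ", "(b)", " here", so the parenthesized text is recoverable from the result. Nested parentheses are out of scope for this sketch.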
> String split with an escape pattern
> -----------------------------------
>                 Key: LANG-860
>                 URL:
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.*
>            Reporter: Michael Knapp
>            Priority: Minor
>              Labels: patch, split
>         Attachments: StringUtilsSplitEscapingly.patch
>   Original Estimate: 1h
>  Remaining Estimate: 1h
> Often there are strings which are delimited, but certain patterns can escape the
> delimiter.  For example, quotes are used in CSV to escape a comma delimiter.  I have written
> a couple of methods for StringUtils that split strings while considering the possibility of an
> escape pattern.  For example, when given "a,\"b,c\",c", it will produce {"a","\"b,c\"","c"}.
> In my code, the delimiter can be a string, and it can be escaped by any regular expression
> pattern.  Unit tests are already written and passing.
> I plan to attach the patch for this once the ticket is created.  I just need a committer
> to review the patch, approve it, and commit it for me.
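
The behaviour described in the issue can be sketched roughly as follows. This is an illustration only, not the contents of StringUtilsSplitEscapingly.patch: the names EscapedSplitSketch and splitWithEscape are hypothetical. It assumes a non-empty string delimiter and an escape regex whose matches (here, a double-quoted run) are skipped over as a unit, so any delimiter inside them is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapedSplitSketch {

    // Hypothetical helper: splits text on a literal delimiter, except where
    // the delimiter falls inside a region matched by the escape pattern.
    static List<String> splitWithEscape(String text, String delimiter, Pattern escape) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> parts = new ArrayList<>();
        Matcher m = escape.matcher(text);
        int tokenStart = 0; // where the current token began
        int pos = 0;        // current scan position
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (m.lookingAt() && m.end() > pos) {
                // an escaped region starts here: skip it whole
                pos = m.end();
            } else if (text.startsWith(delimiter, pos)) {
                // unescaped delimiter: close the current token
                parts.add(text.substring(tokenStart, pos));
                pos += delimiter.length();
                tokenStart = pos;
            } else {
                pos++;
            }
        }
        parts.add(text.substring(tokenStart)); // last token (may be empty)
        return parts;
    }

    public static void main(String[] args) {
        // Assumed escape pattern: a double-quoted run without embedded quotes.
        Pattern quoted = Pattern.compile("\"[^\"]*\"");
        // prints [a, "b,c", c]
        System.out.println(splitWithEscape("a,\"b,c\",c", ",", quoted));
    }
}
```

The escaped text is kept verbatim, quotes included, which matches the {"a","\"b,c\"","c"} result given in the description. An unterminated quote simply fails to match the escape pattern, so the delimiters after it are treated as ordinary ones.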

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
