commons-issues mailing list archives

From "Michael Knapp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-860) String split with an escape pattern
Date Sat, 24 Nov 2012 23:26:58 GMT

    [ https://issues.apache.org/jira/browse/LANG-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503436#comment-13503436 ]

Michael Knapp commented on LANG-860:
------------------------------------

I beg to differ: commons-csv assumes there can be an escape character, whereas my code assumes
there can be an escape pattern, so it handles a much broader range of problems than CSV.
For example, what if you want to get all the parenthesized text out of a document? commons-csv
cannot do that because '(' and ')' are different characters. Also, commons-csv offers no method
to retain the delimiters that you split on; my code does. Say you split on a pattern matching
opening and closing parentheses: no existing split function in commons-lang, and no function
in commons-csv, can retain the text that matched your delimiter, but my code can.
The code I wrote does not replace commons-csv, nor does it try to; commons-csv handles comments,
empty lines, trimming text, and a whole lot more that is out of the scope of my code. Also,
if you expect anybody to use commons-csv, you should really publish it to the Maven Central
repository and document it a little more.
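
To make the parentheses case concrete, here is a minimal sketch of a split that keeps the text matched by the delimiter pattern. It is not the code from the attached patch: the class name RetainDelimiterSketch and the method splitKeepingDelimiters are hypothetical, and it relies only on java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the API of StringUtilsSplitEscapingly.patch.
final class RetainDelimiterSketch {

    // Splits input on every match of the delimiter pattern and keeps the
    // matched delimiter text as its own element of the result.
    static List<String> splitKeepingDelimiters(String input, Pattern delimiter) {
        List<String> parts = new ArrayList<>();
        Matcher m = delimiter.matcher(input);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                // Text between the previous delimiter and this one.
                parts.add(input.substring(last, m.start()));
            }
            // The delimiter itself is retained rather than discarded.
            parts.add(m.group());
            last = m.end();
        }
        if (last < input.length()) {
            // Trailing text after the final delimiter.
            parts.add(input.substring(last));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Assumed delimiter: an opening parenthesis up to the next closing one.
        Pattern parenthesized = Pattern.compile("\\([^)]*\\)");
        System.out.println(splitKeepingDelimiters("see (a) and (b,c) here", parenthesized));
    }
}
```

With the assumed pattern above, the parenthesized segments "(a)" and "(b,c)" appear in the result alongside the surrounding text, which a plain String.split on the same pattern would discard.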
                
> String split with an escape pattern
> -----------------------------------
>
>                 Key: LANG-860
>                 URL: https://issues.apache.org/jira/browse/LANG-860
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.*
>            Reporter: Michael Knapp
>            Priority: Minor
>              Labels: patch, split
>         Attachments: StringUtilsSplitEscapingly.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Often there are strings that are delimited, but certain patterns can escape the
> delimiter.  For example, quotes are used in CSV to escape a comma delimiter.  I have written
> a couple of methods for StringUtils that split strings while considering the possibility of an
> escape pattern.  For example, when given "a,\"b,c\",c", it will produce {"a","\"b,c\"","c"}.
> In my code, the delimiter can be a string, and it can be escaped by any regular expression
> pattern.  Unit tests are already written and passing.
> I plan to attach the patch for this once the ticket is created.  I just need a committer
> to review the patch, approve it, and commit it for me.
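
The behaviour described in the issue can be sketched roughly as follows. This is an assumption-laden illustration, not the attached patch: the class EscapedSplitSketch and the method splitEscaped are hypothetical names, the delimiter is treated as a literal string, and any text matched by the escape pattern is copied through without being searched for delimiters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the API of StringUtilsSplitEscapingly.patch.
final class EscapedSplitSketch {

    // Splits input on a literal delimiter, except where the delimiter falls
    // inside a region matched by the escape pattern.
    static List<String> splitEscaped(String input, String delimiter, Pattern escape) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Matcher m = escape.matcher(input);
        int i = 0;
        while (i < input.length()) {
            m.region(i, input.length());
            if (m.lookingAt() && m.end() > i) {
                // An escaped region starts here: keep it verbatim, delimiters included.
                current.append(input, i, m.end());
                i = m.end();
            } else if (input.startsWith(delimiter, i)) {
                // Unescaped delimiter: close the current token.
                tokens.add(current.toString());
                current.setLength(0);
                i += delimiter.length();
            } else {
                current.append(input.charAt(i));
                i++;
            }
        }
        tokens.add(current.toString());
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed escape pattern: a double-quoted run with no embedded quotes.
        Pattern quoted = Pattern.compile("\"[^\"]*\"");
        System.out.println(splitEscaped("a,\"b,c\",c", ",", quoted));
    }
}
```

With the assumed quote pattern, the input "a,\"b,c\",c" yields the three tokens a, "b,c" and c, matching the example in the description. Swapping in a different escape pattern is the only change needed to protect other kinds of regions.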

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
