commons-issues mailing list archives

From rikles <...@git.apache.org>
Subject [GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...
Date Sat, 16 May 2015 12:30:12 GMT
Github user rikles commented on a diff in the pull request:

    https://github.com/apache/commons-lang/pull/75#discussion_r30460736
  
    --- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
    @@ -3277,6 +3277,164 @@ public static String substringBetween(final String str, final String open, final
             return list.toArray(new String[list.size()]);
         }
     
    +    /**
    +     * <p>Split a String into an array, using an array of fixed string lengths.</p>
    +     *
    +     * <p>If not null String input, the returned array size is same as the input lengths array.</p>
    +     *
    +     * <p>A null input String returns {@code null}.
    +     * A {@code null} or empty input lengths array returns an empty array.
    +     * A {@code 0} in the input lengths array results in en empty string.</p>
    +     *
    +     * <p>Extra characters are ignored (ie String length greater than sum of split lengths).
    +     * All empty substrings other than zero length requested, are returned {@code null}.</p>
    +     *
    +     * <pre>
    +     * StringUtils.splitByLength(null, *)      = null
    +     * StringUtils.splitByLength("abc")        = []
    +     * StringUtils.splitByLength("abc", null)  = []
    +     * StringUtils.splitByLength("abc", [])    = []
    +     * StringUtils.splitByLength("", 2, 4, 1)  = [null, null, null]
    +     *
    +     * StringUtils.splitByLength("abcdefg", 2, 4, 1)     = ["ab", "cdef", "g"]
    --- End diff --
    
    As the next line says, `StringUtils.splitByLength("abcdefg", 2, 2)` will return `["ab", "cd"]`, and likewise:
    `StringUtils.splitByLength("abcdefghij", 2, 4, 1)  = ["ab", "cdef", "g"]`
    
    I asked myself this question during development: do we discard the extra characters?
    I think it would be nice to let users decide. Moreover, depending on the use case, it could be useful to keep or discard the "first extra characters", i.e. those at the start of the string (for example, when parsing a single-line, commented-out string).
    
    I propose to:
      * add a private `splitByLengthWorker(String string, boolean splitFromEnd, boolean discardExtraChar, int... lengths)`
      * keep the logic of this `splitByLength(String, int...)` method as the default: `return splitByLengthWorker(string, false, true, lengths)`. So, by default, the returned array has the same size as the `int... lengths` array parameter, which is useful when parsing strings with fixed column lengths.
      * add a `splitByLengthKeepExtraChar(String, int...)`: `return splitByLengthWorker(string, false, false, lengths)`
      * add a `splitByLengthFromEnd(String, int...)`: `return splitByLengthWorker(string, true, true, lengths)`
      * add a `splitByLengthFromEndKeepExtraChar(String, int...)`: `return splitByLengthWorker(string, true, false, lengths)`
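
A minimal sketch of the from-start half of this proposal, to make the keep/discard choice concrete. The class name `SplitByLengthSketch` is hypothetical, and the worker here drops the `splitFromEnd` parameter because the from-end ordering is still the open question in this comment. It assumes that a string shorter than the requested lengths yields a partial substring followed by `null` entries, and that negative lengths are rejected; neither detail is confirmed by the pull request.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitByLengthSketch {

    // Hypothetical worker covering only the from-start case.
    static String[] splitByLengthWorker(String str, boolean discardExtraChar, int... lengths) {
        if (str == null) {
            return null;                       // null input String returns null
        }
        if (lengths == null || lengths.length == 0) {
            return new String[0];              // no lengths requested: empty array
        }
        List<String> parts = new ArrayList<>();
        int pos = 0;
        for (int len : lengths) {
            if (len < 0) {
                // Assumption: negative lengths are treated as a caller error.
                throw new IllegalArgumentException("negative length: " + len);
            }
            if (len == 0) {
                parts.add("");                 // a 0 length yields an empty string
            } else if (pos >= str.length()) {
                parts.add(null);               // input exhausted: null entry
            } else {
                // Clamp to the end of the string without overflowing pos + len.
                int end = (len > str.length() - pos) ? str.length() : pos + len;
                parts.add(str.substring(pos, end));
                pos = end;
            }
        }
        if (!discardExtraChar && pos < str.length()) {
            parts.add(str.substring(pos));     // keep the trailing extra characters
        }
        return parts.toArray(new String[0]);
    }

    // Default behavior: extra characters are discarded.
    public static String[] splitByLength(String str, int... lengths) {
        return splitByLengthWorker(str, true, lengths);
    }

    // Variant that appends the extra characters as one last element.
    public static String[] splitByLengthKeepExtraChar(String str, int... lengths) {
        return splitByLengthWorker(str, false, lengths);
    }
}
```

With this sketch, `splitByLength("abcdefghij", 2, 4, 1)` gives `["ab", "cdef", "g"]`, while `splitByLengthKeepExtraChar("abcdefghij", 2, 4, 1)` gives `["ab", "cdef", "g", "hij"]`.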
    
    A question: for the _split from end_ methods, which behavior do you think is more logical: _right-aligned_ (RA) or _end-to-start_ (E2S) lengths, and a _reversed_ (R) or _not reversed_ (NR) result?
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["__", "a", "bc", "def"]` - (RA, NR)
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["def", "bc", "a", "__"]` - (RA, R)
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["f", "de", "abc", "__"]` - (E2S, R)
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["__", "abc", "de", "f"]` - (E2S, NR)
    
    I think the first one is more readable, since we can visually follow the splitting, but it may be less intuitive:
    ```
    StringUtils.splitByLengthFromEnd("ABCDEFGHIJKLM", 3, 4, 5)  = ["BCD", "EFGH", "IJKLM"]
     [3][4_][_5_]
    ABCDEFGHIJKLM
    ```
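
To compare the four candidate behaviors listed above side by side, here is a rough sketch that produces each of them from two flags. The class `SplitFromEndSketch`, the method `splitFromEndKeepExtra`, and its `rightAligned` and `reversed` parameters are hypothetical names used only for illustration; they are not part of the pull request. The sketch always keeps the extra characters and skips the null, empty and negative-length handling for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SplitFromEndSketch {

    // Hypothetical helper: splits from the end of the string, keeping extra characters.
    static String[] splitFromEndKeepExtra(String str, boolean rightAligned,
                                          boolean reversed, int... lengths) {
        // Pieces are collected while walking from the end of the string to its start.
        List<String> parts = new ArrayList<>();
        int end = str.length();
        for (int i = 0; i < lengths.length; i++) {
            // Right-aligned: the last length covers the last characters.
            // End-to-start: the first length covers the last characters.
            int len = rightAligned ? lengths[lengths.length - 1 - i] : lengths[i];
            int start = Math.max(0, end - len);
            parts.add(str.substring(start, end));
            end = start;
        }
        if (end > 0) {
            parts.add(str.substring(0, end));  // leading extra characters
        }
        // The walk order is the "reversed" result; flip it to follow the string order.
        if (!reversed) {
            Collections.reverse(parts);
        }
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String s = "__abcdef";
        System.out.println(String.join(",", splitFromEndKeepExtra(s, true, false, 1, 2, 3)));  // (RA, NR)
        System.out.println(String.join(",", splitFromEndKeepExtra(s, true, true, 1, 2, 3)));   // (RA, R)
        System.out.println(String.join(",", splitFromEndKeepExtra(s, false, true, 1, 2, 3)));  // (E2S, R)
        System.out.println(String.join(",", splitFromEndKeepExtra(s, false, false, 1, 2, 3))); // (E2S, NR)
    }
}
```

Only the order in which the lengths are consumed and the final reversal differ between the four options, so whichever one is chosen, the worker logic stays the same.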


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
