commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From bugzi...@apache.org
Subject DO NOT REPLY [Bug 14062] New: - split using null and max less than actual token count adds "null"
Date Tue, 29 Oct 2002 19:43:09 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14062>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14062

split using null and max less than actual token count adds "null"

           Summary: split using null and max less than actual token count
                    adds "null"
           Product: Commons
           Version: 1.0 Final
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Major
          Priority: Other
         Component: Lang
        AssignedTo: commons-dev@jakarta.apache.org
        ReportedBy: mcdoma@ncs.com


/*

This bug is using Commons Lang 1.0 with JDK 1.4.0_01.
The problem only occurs when using null as the separator and using a
max value which is less than the number of actual tokens and making 
use of the last token which consists of what is left of the string
being tokenized. Because null is passed, the existing code loops through
the rest of the tokens, appending null to each of them, expecting that
the result will be the remaining string, except the string "null" is 
appended rather than a real token...

*/

import org.apache.commons.lang.StringUtils;
import java.util.StringTokenizer;

/**
 * If you replace the
 *   String[] split(String str, String separator, int max) code with the
 *   fixedSplit code below, it will fix the problem.  The code below 
demonstrates the problem
 *   and the solution.
 *
 * NOTE: I have not run any JUnit tests for this - don't know how.  But - this 
demonstrates the
 * but and supplies the solution.
 *
 */

public class DemoSplitBug {
    public static final String strToSplit = "This is a test of the emergency 
broadcast system.";

    public static void main(String[] args) {
        DemoSplitBug.demoSplitBug(strToSplit);
        DemoSplitBug.demoSplitBugFix(strToSplit);
    }

    public static void demoSplitBug(String theString) {
        String[] theSplits = StringUtils.split(theString, null, 4);
        for (int i = 0; i < theSplits.length; i++) {
            String theSplit = theSplits[i];
            System.out.println("theSplit:" + theSplit);
        }
    }

    public static void demoSplitBugFix(String theString) {
        String[] theSplits = fixedSplit(theString, null, 4);
        for (int i = 0; i < theSplits.length; i++) {
            String theSplit = theSplits[i];
            System.out.println("theSplit:" + theSplit);
        }
    }

    public static String[] fixedSplit(String str, String separator, int max) {
        StringTokenizer tok = null;
        if (separator == null) {
            // Null separator means we're using StringTokenizer's default
            // delimiter, which comprises all whitespace characters.
            tok = new StringTokenizer(str);
        } else {
            tok = new StringTokenizer(str, separator);
        }

        int listSize = tok.countTokens();
        if (max > 0 && listSize > max) {
            listSize = max;
        }

        String[] list = new String[listSize];
        int i = 0;
        int lastTokenBegin = 0;
        int lastTokenEnd = 0;
        while (tok.hasMoreTokens()) {
            if (max > 0 && i == listSize - 1) {
                // In the situation where we hit the max yet have
                // tokens left over in our input, the last list
                // element gets all remaining text.
                String endToken = tok.nextToken();
                lastTokenBegin = str.indexOf(endToken, lastTokenEnd);
                list[i] = str.substring(lastTokenBegin);
                break;
            } else {
                list[i] = tok.nextToken();
                lastTokenBegin = str.indexOf(list[i], lastTokenBegin);
                lastTokenEnd = lastTokenBegin + list[i].length();
            }
            i++;
        }
        return list;
    }
}

--
To unsubscribe, e-mail:   <mailto:commons-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:commons-dev-help@jakarta.apache.org>


Mime
View raw message