commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-1300) Clarify or improve behaviour of int-based methods in StringUtils
Date Tue, 14 Mar 2017 12:37:41 GMT

    [ https://issues.apache.org/jira/browse/LANG-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924100#comment-15924100 ]

ASF GitHub Bot commented on LANG-1300:
--------------------------------------

Github user chtompki commented on the issue:

    https://github.com/apache/commons-lang/pull/251
  
    Okay. It seems that we'll want some clearer documentation. Note that the Javadoc on `indexOf` in the JDK reads as follows:
    
    ```java
    /**
         * Returns the index within this string of the first occurrence of the
         * specified character, starting the search at the specified index.
         * <p>
         * If a character with value <code>ch</code> occurs in the
         * character sequence represented by this <code>String</code>
         * object at an index no smaller than <code>fromIndex</code>, then
         * the index of the first such occurrence is returned. For values
         * of <code>ch</code> in the range from 0 to 0xFFFF (inclusive),
         * this is the smallest value <i>k</i> such that:
         * <blockquote><pre>
         * (this.charAt(<i>k</i>) == ch) && (<i>k</i> &gt;= fromIndex)
         * </pre></blockquote>
         * is true. For other values of <code>ch</code>, it is the
         * smallest value <i>k</i> such that:
         * <blockquote><pre>
         * (this.codePointAt(<i>k</i>) == ch) && (<i>k</i> &gt;= fromIndex)
         * </pre></blockquote>
         * is true. In either case, if no such character occurs in this
         * string at or after position <code>fromIndex</code>, then
         * <code>-1</code> is returned.
         *
         * <p>
         * There is no restriction on the value of <code>fromIndex</code>. If it
         * is negative, it has the same effect as if it were zero: this entire
         * string may be searched. If it is greater than the length of this
         * string, it has the same effect as if it were equal to the length of
         * this string: <code>-1</code> is returned.
         *
         * <p>All indices are specified in <code>char</code> values
         * (Unicode code units).
         *
         * @param   ch          a character (Unicode code point).
         * @param   fromIndex   the index to start the search from.
         * @return  the index of the first occurrence of the character in the
         *          character sequence represented by this object that is greater
         *          than or equal to <code>fromIndex</code>, or <code>-1</code>
         *          if the character does not occur.
         */
    ```
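
    To make the quoted contract concrete, here is a small sketch that exercises `String.indexOf(int, int)` with a BMP character, a supplementary character, and out-of-range `fromIndex` values. The class name `IndexOfContractDemo`, the helper `sample()` and the sample string are made up for illustration; only the JDK method itself comes from the Javadoc above.

    ```java
    public class IndexOfContractDemo {

        // Hypothetical sample: "ab", then U+2070E (one code point, two char values), then "ab".
        public static String sample() {
            return "ab" + new String(Character.toChars(0x2070E)) + "ab";
        }

        public static void main(String[] args) {
            String s = sample();

            // Indices are counted in char values, so the surrogate pair occupies 2 and 3.
            System.out.println(s.length());            // 6
            System.out.println(s.indexOf('b', 0));     // 1
            System.out.println(s.indexOf('b', 2));     // 5

            // A value above 0xFFFF is matched as a whole code point.
            System.out.println(s.indexOf(0x2070E, 0)); // 2

            // A negative fromIndex behaves like 0; one past the length yields -1.
            System.out.println(s.indexOf('a', -5));    // 0
            System.out.println(s.indexOf('a', 100));   // -1
        }
    }
    ```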
    
    Let's see if we can get something along these lines on both `CharSequenceUtils` and `StringUtils`. Furthermore, I think we should stick as closely as possible to the implementation in `String`, simply generalizing it to `CharSequence`s.
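
One way to read "generalizing to `CharSequence`" is sketched below. This is only an illustration under stated assumptions, not the code in the pull request or in Commons Lang: the class `CharSequenceIndexOfSketch` and its `indexOf` method are hypothetical, and the method simply tries to follow the `String.indexOf(int, int)` contract quoted above (clamping `fromIndex`, comparing `char` values for the BMP range, and matching a surrogate pair otherwise).

```java
public class CharSequenceIndexOfSketch {

    // Hypothetical helper mirroring the String.indexOf(int, int) contract for any CharSequence.
    public static int indexOf(CharSequence cs, int ch, int fromIndex) {
        final int len = cs.length();
        final int start = Math.max(fromIndex, 0); // negative behaves like 0
        if (start >= len) {
            return -1;                            // at or past the end: nothing to search
        }
        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // BMP value (or an invalid negative one, which no char can equal).
            for (int i = start; i < len; i++) {
                if (cs.charAt(i) == ch) {
                    return i;
                }
            }
            return -1;
        }
        if (Character.isValidCodePoint(ch)) {
            // Supplementary code point: look for its surrogate pair.
            final char high = Character.highSurrogate(ch);
            final char low = Character.lowSurrogate(ch);
            for (int i = start; i < len - 1; i++) {
                if (cs.charAt(i) == high && cs.charAt(i + 1) == low) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("ab").appendCodePoint(0x2070E).append("ab");
        System.out.println(indexOf(sb, 0x2070E, 0)); // 2
        System.out.println(indexOf(sb, 'b', 2));     // 5
    }
}
```

With a helper of this shape, a `StringBuilder` and the `String` obtained from it should give the same answer for the same arguments, which is the consistency the comment asks for.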


> Clarify or improve behaviour of int-based methods in StringUtils
> ----------------------------------------------------------------
>
>                 Key: LANG-1300
>                 URL: https://issues.apache.org/jira/browse/LANG-1300
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.*
>    Affects Versions: 3.5
>            Reporter: Duncan Jones
>            Priority: Minor
>             Fix For: Discussion
>
>
> The following methods use an {{int}} to represent a search character:
> {code:java}
> boolean contains(final CharSequence seq, final int searchChar)
> int indexOf(final CharSequence seq, final int searchChar)
> int indexOf(final CharSequence seq, final int searchChar, final int startPos)
> int lastIndexOf(final CharSequence seq, final int searchChar)
> int lastIndexOf(final CharSequence seq, final int searchChar, final int startPos)
> {code}
> When I see an {{int}} representing a character, I tend to assume the method can handle supplementary characters. However, the current behaviour of these methods depends upon whether the {{CharSequence}} is a {{String}} or not.
> {code:java}
> StringBuilder builder = new StringBuilder();
> builder.appendCodePoint(0x2070E);
> System.out.println(StringUtils.lastIndexOf(builder, 0x2070E)); // -1
> System.out.println(StringUtils.lastIndexOf(builder.toString(), 0x2070E)); // 0
> {code}
> The Javadoc for these methods is ambiguous on this point, stating:
> {quote}
> This method uses {{String.lastIndexOf(int)}} if possible.
> {quote}
> I think we should consider updating the {{CharSequenceUtils}} methods used by this class to convert all {{CharSequence}} parameters to strings, enabling full code point support. The docs could be updated to make this crystal clear.
> There is a question of whether this breaks backwards compatibility.
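
The conversion idea in the issue description can be illustrated without Commons Lang on the classpath. In the sketch below, `lastIndexOfCharByChar` is a simplified, hypothetical stand-in for a `char`-by-`char` fallback (an assumption about why a non-`String` input returns -1, not the library's actual source), and `lastIndexOfViaString` is a hypothetical helper that converts to a `String` first, as the reporter suggests.

```java
public class LastIndexOfConversionSketch {

    // Hypothetical stand-in: compares single char values, so a value above 0xFFFF never matches.
    public static int lastIndexOfCharByChar(CharSequence seq, int searchChar) {
        for (int i = seq.length() - 1; i >= 0; i--) {
            if (seq.charAt(i) == searchChar) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical helper: convert to String and delegate, gaining code point support.
    public static int lastIndexOfViaString(CharSequence seq, int searchChar) {
        return seq == null ? -1 : seq.toString().lastIndexOf(searchChar);
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        builder.appendCodePoint(0x2070E);

        System.out.println(lastIndexOfCharByChar(builder, 0x2070E)); // -1
        System.out.println(lastIndexOfViaString(builder, 0x2070E));  // 0
    }
}
```

The trade-off is that `toString()` copies the sequence on every call, which may matter for large non-`String` inputs and is part of what the compatibility question has to weigh.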



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
