tomcat-dev mailing list archives

From Konstantin Kolinko <knst.koli...@gmail.com>
Subject Re: svn commit: r944918 - /tomcat/trunk/java/org/apache/tomcat/util/buf/ByteChunk.java
Date Sun, 16 May 2010 23:03:43 GMT
2010/5/17  <markt@apache.org>:
> Author: markt
> Date: Sun May 16 21:31:57 2010
> New Revision: 944918
>
> URL: http://svn.apache.org/viewvc?rev=944918&view=rev
> Log:
> Code clean-up
>
> Modified:
>    tomcat/trunk/java/org/apache/tomcat/util/buf/ByteChunk.java
>

> +    public static int indexOf(byte bytes[], int start, int end, char c) {
> +        return findChar(bytes, start, end, c);
>     }

ByteChunk#indexOf(..) and ByteChunk#findChar(..), as they were
implemented before this change, do not behave the same way.

indexOf() relied on a (byte == char) comparison, which widens both
operands to int: the byte is sign-extended and the char is
zero-extended, so a negative byte never equals any char.
findChar() does byte b=(byte)c; and compares bytes.

The difference is that indexOf() can find ASCII (0-127) characters
only, while findChar() can find any ISO-8859-1 char.

Below is a small program that demonstrates the above:
[[[
public class Test {
    public static void main(String[] args) {
        // code 160: &nbsp;
        char c = '\u00a0';
        byte b = (byte) c;              // narrows 0x00A0 to 0xA0, i.e. -96
        System.out.println(b == c);     // both widened to int: -96 vs 160
        System.out.println(b);

        char c2 = (char) (b & 0xFF);    // the mask recovers the unsigned value
        System.out.println(c2 == c);

        byte[] bytes = new byte[]{65, 66, 67, -96, 68};
        System.out.println(indexOf(bytes, 0, bytes.length, c));
        System.out.println(findChar(bytes, 0, bytes.length, c));
    }

    // ByteChunk#indexOf(..) as implemented before r944918
    public static int indexOf(byte bytes[], int off, int end, char qq) {
        // Works only for UTF
        while (off < end) {
            byte b = bytes[off];
            // byte is sign-extended, char is zero-extended before comparing
            if (b == qq) {
                return off;
            }
            off++;
        }
        return -1;
    }

    /** Find a character, no side effects.
     *  @return index of char if found, -1 if not
     */
    // ByteChunk#findChar(..) as implemented before r944918
    public static int findChar(byte buf[], int start, int end, char c) {
        byte b = (byte) c;
        int offset = start;
        while (offset < end) {
            if (buf[offset] == b) {
                return offset;
            }
            offset++;
        }
        return -1;
    }
}
]]]

It prints:
false
-96
true
-1
3


r944918 broke findChar(),
but there are actually no calls to that method: my IDE does not find
any references to the find* methods (findChar, findChars and
findNotChars) of ByteChunk in our trunk or in TC6.
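
For illustration only, here is one possible way to write the comparison
so that it covers the whole ISO-8859-1 range: mask the byte with 0xFF
before comparing it with the char, as the c2 line in the program above
does. This is a sketch, not what ByteChunk does; the class name
MaskedIndexOf is made up for the example. Unlike the (byte)c cast, the
mask also avoids matching chars above 255 whose low byte happens to be
equal to the byte in the buffer.

```java
public class MaskedIndexOf {

    /**
     * Hypothetical helper, not part of ByteChunk: finds c by comparing
     * the unsigned value of each byte with the char.
     * @return index of char if found, -1 if not
     */
    public static int indexOf(byte[] bytes, int start, int end, char c) {
        for (int i = start; i < end; i++) {
            // bytes[i] & 0xFF maps the signed byte to the range 0..255
            if ((bytes[i] & 0xFF) == c) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[]{65, 66, 67, -96, 68};
        // code 160 is found at index 3
        System.out.println(indexOf(bytes, 0, bytes.length, '\u00a0'));
        // plain ASCII still works: 'C' is at index 2
        System.out.println(indexOf(bytes, 0, bytes.length, 'C'));
        // a char above 255 with low byte 0xA0 is not matched: -1
        System.out.println(indexOf(bytes, 0, bytes.length, '\u20a0'));
    }
}
```

It prints 3, 2 and -1. The old findChar() would return 3 for the last
call, because the cast to byte drops the high byte of the char.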


Best regards,
Konstantin Kolinko
