lucene-dev mailing list archives

From "Yonik Seeley" <ysee...@gmail.com>
Subject Re: (byte)((i & 0x7f) | 0x80) == (byte)(i | 0x80)
Date Wed, 26 Apr 2006 17:00:03 GMT
On 4/26/06, Charlie <charliecmo@gmail.com> wrote:
> I thought
>
>   (byte)((i & 0x7f) | 0x80) == (byte)(i | 0x80)
>
> As (byte) is able to truncate the last byte for us already, no need of
> (& 0x7f). If so, we may change that line to
>
>    writeByte((byte)(i | 0x80));
>
> and may speed up a little bit. Correct me if (i & 0x7f) is necessary.
> Thank you.
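
For reference, the equivalence claimed above can be checked directly. The following is a minimal sketch, not Lucene code: the class and method names (ByteMaskCheck, masked, unmasked, agreeOnSamples) are made up for illustration. It only tests that the two expressions yield the same byte; it says nothing about whether dropping the mask is measurably faster.

```java
// Hypothetical demo class; not part of Lucene.
class ByteMaskCheck {
    // The expression as quoted in the subject line: mask to 7 bits, then set bit 7.
    static byte masked(int i) {
        return (byte) ((i & 0x7f) | 0x80);
    }

    // The proposed simplification: set bit 7 and let the cast drop the upper bits.
    static byte unmasked(int i) {
        return (byte) (i | 0x80);
    }

    // Compares both forms for every possible low byte, combined with a few
    // different patterns in the upper 24 bits (including negative ints).
    static boolean agreeOnSamples() {
        int[] highBits = {0, 0x100, 0x7fffff00, 0xffffff00};
        for (int high : highBits) {
            for (int low = 0; low < 256; low++) {
                int i = high | low;
                if (masked(i) != unmasked(i)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnSamples()); // prints "true"
    }
}
```

Both forms agree because the cast to byte keeps only the low 8 bits: bit 7 is forced to 1 by the OR in either case, and bits 0-6 come from i unchanged.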

I wouldn't bother optimizing these methods... I think they will be
changed in the future anyway:
1) The current code outputs modified UTF-8 instead of true UTF-8.
2) I think we may be moving to byte-oriented counts for length (away
from the number of Java chars, since a character can take a variable
number of chars under the latest Unicode standards).
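
To illustrate the two points above, here is a small sketch using only the JDK. It is not Lucene's own I/O code: DataOutputStream.writeUTF is used as a stand-in because it is documented to write modified UTF-8, and the class and method names (Utf8LengthDemo, charCount, utf8Bytes, modifiedUtf8Bytes) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Lucene.
class Utf8LengthDemo {
    // Length in Java chars (UTF-16 code units).
    static int charCount(String s) {
        return s.length();
    }

    // Length in bytes when encoded as standard UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Length in bytes when encoded as modified UTF-8, using the JDK's
    // DataOutputStream.writeUTF as a stand-in encoder.
    static int modifiedUtf8Bytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size() - 2; // writeUTF prepends a 2-byte length field
    }

    public static void main(String[] args) {
        String supplementary = "\uD835\uDD0A"; // U+1D50A, one character, two Java chars
        String nul = "\u0000";

        System.out.println(charCount(supplementary));         // 2
        System.out.println(utf8Bytes(supplementary));         // 4
        System.out.println(modifiedUtf8Bytes(supplementary)); // 6
        System.out.println(utf8Bytes(nul));                   // 1
        System.out.println(modifiedUtf8Bytes(nul));           // 2
    }
}
```

The supplementary character shows why the three counts diverge: it is one character, two Java chars, four bytes in true UTF-8, and six bytes in modified UTF-8, which encodes each surrogate separately. Modified UTF-8 also writes U+0000 as two bytes rather than one.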

Marvin Humphrey has done #1, and seems close to finishing #2.

http://www.mail-archive.com/java-dev@lucene.apache.org/msg01970.html
http://www.mail-archive.com/java-dev@lucene.apache.org/msg02109.html
http://www.mail-archive.com/java-dev@lucene.apache.org/msg02468.html
http://www.mail-archive.com/java-dev@lucene.apache.org/msg03801.html

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server


