lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: svn commit: r883088 - in /lucene/java/branches/flex_1458/src/java/org/apache/lucene/index: TermRef.java codecs/standard/StandardTermsDictReader.java
Date Sun, 22 Nov 2009 21:19:36 GMT
On Sun, Nov 22, 2009 at 4:16 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> On Sun, Nov 22, 2009 at 4:06 PM, Robert Muir <rcmuir@gmail.com> wrote:
> > I guess here is where I just say that Unicode and Java are optimized for
> > UTF-16 processing
>
> I agree, though leaving things as UTF8 works fine for low level stuff
> (sorting, comparing equality, etc.)?
>

+1
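
To illustrate the point about sorting and equality on raw UTF-8, here is a minimal sketch. It is not code from the flex branch, and the class and method names are hypothetical. It compares terms as unsigned bytes, which orders well-formed UTF-8 by Unicode code point. One thing to keep in mind is that this order can differ from String.compareTo, which compares UTF-16 code units, whenever supplementary characters are involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of Lucene.
class Utf8OrderSketch {

    // Lexicographic comparison of two byte arrays, treating each byte as unsigned.
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Compare two strings by their UTF-8 encodings.
    static int compareUtf8(String a, String b) {
        return compareUnsignedBytes(a.getBytes(StandardCharsets.UTF_8),
                                    b.getBytes(StandardCharsets.UTF_8));
    }

    // Equality on the UTF-8 bytes.
    static boolean equalsUtf8(String a, String b) {
        return Arrays.equals(a.getBytes(StandardCharsets.UTF_8),
                             b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";         // U+FF5E: one UTF-16 code unit, UTF-8 bytes EF BD 9E
        String supp = "\uD835\uDD0A";  // U+1D50A: a surrogate pair, UTF-8 bytes F0 9D 94 8A

        // UTF-16 code unit order puts the BMP character last (0xFF5E > 0xD835).
        System.out.println(bmp.compareTo(supp) > 0);     // prints true
        // UTF-8 byte order follows code points (0xEF < 0xF0).
        System.out.println(compareUtf8(bmp, supp) < 0);  // prints true
        System.out.println(equalsUtf8("abc", "abc"));    // prints true
    }
}
```

So equality is safe on bytes, and sorting is consistent as long as everything that relies on term order agrees on code point order rather than UTF-16 order.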


>
> > and so while I agree with byte[] being available in
> > places like this for flex indexing,
> > I'm already nervous about seeing code / optimizations that only work well
> > with latin-1, and are very slow / buggy for anything else.
>
> Buggy we should clearly outright fix.
>
> Slower, maybe.  But very slow, I hope not?
>
> What places specifically are you worried about?
>

Places like AutomatonQuery, where I found myself tempted to process byte[]
directly, even though I know this is very bad!
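
A small sketch of the kind of bug I mean. This is not AutomatonQuery code; the pattern and the method names are made up for illustration. A matcher that treats one byte as one character behaves correctly on ASCII terms and silently fails on anything else.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example, for illustration only; not part of Lucene.
class ByteVsCodePointSketch {

    // Made-up pattern "a?c": 'a', then exactly one unit, then 'c'.
    // Here one unit is one UTF-8 byte, which is only right for ASCII.
    static boolean matchesOverBytes(String term) {
        byte[] b = term.getBytes(StandardCharsets.UTF_8);
        return b.length == 3 && b[0] == 'a' && b[2] == 'c';
    }

    // Same pattern, but one unit is one Unicode code point.
    static boolean matchesOverCodePoints(String term) {
        int[] cps = term.codePoints().toArray();
        return cps.length == 3 && cps[0] == 'a' && cps[2] == 'c';
    }

    public static void main(String[] args) {
        String ascii = "abc";
        String accented = "a\u00E9c";  // U+00E9 is two bytes in UTF-8 (C3 A9)

        System.out.println(matchesOverBytes(ascii));          // prints true
        System.out.println(matchesOverCodePoints(ascii));     // prints true
        System.out.println(matchesOverBytes(accented));       // prints false
        System.out.println(matchesOverCodePoints(accented));  // prints true
    }
}
```

The byte version is not inherently wrong, but "any character" then has to be expressed as a whole UTF-8 sequence rather than a single byte, and that is easy to miss when tests only use ASCII terms.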


> Mike
>


-- 
Robert Muir
rcmuir@gmail.com
