lucene-dev mailing list archives

From: Doug Cutting
Subject: Re: Lucene does NOT use UTF-8.
Date: Mon, 29 Aug 2005 21:23:53 GMT

Ken Krugler wrote:
> The remaining issue is dealing with old-format indexes.

I think that revving the version number on the segments file would be a 
good start.  This file must be read before any others.  Its current 
version is -1 and would become -2.  (All positive values are version 0, 
for back-compatibility.)  Implementations can be modified to pass the 
version around if they wish to be back-compatible, or they can simply 
throw exceptions for old format indexes.
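
As a rough sketch of that check (not Lucene's actual code): the class, 
constant, and method names below are hypothetical, the segments file is 
stood in for by a plain DataInputStream, and treating a first int of 
zero as old-format is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; these names are not Lucene's API.
class SegmentsFormat {
    static final int FORMAT_LEGACY = 0;       // any non-negative first int (assumed to include 0)
    static final int FORMAT_CURRENT = -1;     // the version in use today
    static final int FORMAT_CHAR_COUNT = -2;  // the proposed new version

    // Reads the first int of the segments file and maps it to a format version.
    static int readFormat(DataInputStream in) throws IOException {
        int first = in.readInt();
        return first >= 0 ? FORMAT_LEGACY : first;
    }

    // The "simply throw" option: accept only the new format.
    static void requireNewFormat(int format) {
        if (format != FORMAT_CHAR_COUNT) {
            throw new IllegalStateException("Unsupported index format: " + format);
        }
    }

    // Convenience wrapper for an in-memory header.
    static int formatOf(byte[] header) {
        try {
            return readFormat(new DataInputStream(new ByteArrayInputStream(header)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A back-compatible implementation would instead pass the value returned 
by readFormat down to whatever reads strings, and branch on it there.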

I would argue that the length written should be the number of characters 
in the string, rather than the number of bytes written, since a reader 
can then allocate the string's character buffer once, at its final size.
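
To make the allocation point concrete, here is a minimal sketch under 
stated assumptions: the class and method names are hypothetical, the 
length is written as a plain 4-byte int rather than whatever encoding 
the index really uses, the payload is standard UTF-8, and malformed 
input is not validated.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not Lucene's actual string I/O.
class CharCountStrings {
    // Writes the length in Java chars (UTF-16 code units), then UTF-8 bytes.
    static void writeString(DataOutputStream out, String s) throws IOException {
        out.writeInt(s.length());
        out.write(s.getBytes(StandardCharsets.UTF_8));
    }

    // Knows the char count up front, so the buffer is allocated exactly once.
    static String readString(DataInputStream in) throws IOException {
        int length = in.readInt();
        char[] chars = new char[length];
        int i = 0;
        while (i < length) {
            int b = in.readUnsignedByte();
            if (b < 0x80) {                       // 1-byte sequence
                chars[i++] = (char) b;
            } else if ((b & 0xE0) == 0xC0) {      // 2-byte sequence
                int b2 = in.readUnsignedByte();
                chars[i++] = (char) (((b & 0x1F) << 6) | (b2 & 0x3F));
            } else if ((b & 0xF0) == 0xE0) {      // 3-byte sequence
                int b2 = in.readUnsignedByte();
                int b3 = in.readUnsignedByte();
                chars[i++] = (char) (((b & 0x0F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F));
            } else {                              // 4-byte sequence: one surrogate pair
                int b2 = in.readUnsignedByte();
                int b3 = in.readUnsignedByte();
                int b4 = in.readUnsignedByte();
                int cp = ((b & 0x07) << 18) | ((b2 & 0x3F) << 12)
                        | ((b3 & 0x3F) << 6) | (b4 & 0x3F);
                chars[i++] = Character.highSurrogate(cp);
                chars[i++] = Character.lowSurrogate(cp);
            }
        }
        return new String(chars);
    }

    // Writes a string to memory and reads it back.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeString(new DataOutputStream(bytes), s);
            return readString(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a byte-count prefix the reader would not know the final character 
count until decoding finished, so it would need either an oversized 
buffer or a second pass; that is the trade-off this sketch illustrates.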

> I'm going to take this off-list now [ ... ]

Please don't.  It's better to have a record of the discussion.

