lucene-dev mailing list archives

From Robert Muir <>
Subject Re: Sorting with little memory: A suggestion
Date Fri, 19 Mar 2010 17:57:24 GMT
On Fri, Mar 19, 2010 at 1:46 PM, Toke Eskildsen <> wrote:
> From: Robert Muir []:
>> Toke, only partially-on-topic here, is it possible to describe your
>> use-case a little more where it's preferable to use this Locale-based
>> sort instead of indexing collation keys (e.g. you have to support so
>> many locales this would be too much indexing overhead?)
> My original use case was to avoid the memory overhead: Looking at our current index,
> we have ~7.5M documents with ~7M unique titles. They take up about 362MB as UTF-8 bytes, which
> translates to a neat 1GB of RAM as Java Strings. That's 1GB less heap for other stuff for
> us, plus a sort is fairly slow. Indexing collation keys only helps with the speed problem.

I don't really understand this measurement: collation keys are
byte[]... (although it's true we don't yet encode them this way in
flex; I think we should)
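
To illustrate the byte[] point, here is a minimal sketch using only the JDK's java.text.Collator, not Lucene's or ICU's collation support. The class and method names (CollationKeyBytes, compareUnsigned, sortByKeyBytes) are hypothetical, made up for this example. It assumes the documented CollationKey behaviour that the byte arrays of two comparable keys can be compared directly, most significant byte first, to get the same result as comparing the keys.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: sorts strings by the raw bytes
// of their JDK collation keys instead of by the Strings themselves.
final class CollationKeyBytes {

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Returns a sorted copy of titles, ordered by collation-key bytes.
    static String[] sortByKeyBytes(String[] titles, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        final byte[][] keys = new byte[titles.length][];
        Integer[] order = new Integer[titles.length];
        for (int i = 0; i < titles.length; i++) {
            // The key is computed once per title and kept only as byte[].
            keys[i] = collator.getCollationKey(titles[i]).toByteArray();
            order[i] = i;
        }
        Arrays.sort(order, (x, y) -> compareUnsigned(keys[x], keys[y]));
        String[] sorted = new String[titles.length];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = titles[order[i]];
        }
        return sorted;
    }
}
```

Once the keys exist, the sort needs only the byte arrays and a plain byte comparison; no String or Collator is touched during comparisons.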

Robert Muir
