lucene-dev mailing list archives

From "Toke Eskildsen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2369) Locale-based sort by field with low memory overhead
Date Sun, 02 May 2010 23:35:57 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863219#action_12863219 ]

Toke Eskildsen commented on LUCENE-2369:
----------------------------------------

Earwin: I've spent some time looking through trunk and, happily, access-by-ordinal is part
of it. This seems to make it possible to do the low-memory sort trick without invasive surgery
on the IndexReaders. However, I find it hard to do entirely cleanly: FieldCache.DEFAULT is
used throughout Lucene but is hardwired for specific types of Terms. As it would be best to
have closed IndexReaders purged automatically, some sort of extension of DEFAULT
is needed. Making a wrapper FieldCache that encapsulates the existing DEFAULT and replaces
it should work, and since DEFAULT is public static, this might even be the intended way
of doing it? This would change LUCENE-2369 from a large patch for Lucene core to a standard
contrib.
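
To make the wrapper idea concrete, here is a minimal sketch of the delegation pattern only. Everything in it is hypothetical: TermCache is a stand-in contract invented for illustration (it is not Lucene's FieldCache interface), readers are represented by plain Object keys, and the ordinalLoader is assumed to be supplied by the caller. It shows a wrapper that passes existing lookups through, adds one new entry type, and purges both caches together.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical minimal cache contract; a stand-in, NOT Lucene's FieldCache. */
interface TermCache {
    String[] getStrings(Object readerKey, String field);
    void purge(Object readerKey);
}

/** Wraps an existing cache and adds an entry type the wrapped cache lacks. */
final class WrappingTermCache implements TermCache {
    private final TermCache delegate;
    // Assumed to be supplied by the caller: builds the ordinal structure for a field.
    private final BiFunction<Object, String, int[]> ordinalLoader;
    private final Map<Object, Map<String, int[]>> ordinals = new HashMap<>();

    WrappingTermCache(TermCache delegate,
                      BiFunction<Object, String, int[]> ordinalLoader) {
        this.delegate = delegate;
        this.ordinalLoader = ordinalLoader;
    }

    /** Existing entry types pass straight through to the wrapped cache. */
    @Override
    public String[] getStrings(Object readerKey, String field) {
        return delegate.getStrings(readerKey, field);
    }

    /** New entry type: built on first use, then served from the wrapper. */
    synchronized int[] getSortedOrdinals(Object readerKey, String field) {
        return ordinals
            .computeIfAbsent(readerKey, k -> new HashMap<>())
            .computeIfAbsent(field, f -> ordinalLoader.apply(readerKey, f));
    }

    /** One purge call clears the wrapper's entries and the wrapped cache's. */
    @Override
    public synchronized void purge(Object readerKey) {
        ordinals.remove(readerKey);
        delegate.purge(readerKey);
    }
}
```

Whether a wrapper like this can actually be installed in place of DEFAULT depends on how that field is declared, which the sketch does not address.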

Thanks for the heads-up, Earwin.

> Locale-based sort by field with low memory overhead
> ---------------------------------------------------
>
>                 Key: LUCENE-2369
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2369
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>            Reporter: Toke Eskildsen
>            Priority: Minor
>
> The current implementation of locale-based sort in Lucene uses the FieldCache, which keeps
> all sort terms in memory. Besides the huge memory overhead, searching requires comparing
> terms with collator.compare every time, making searches with millions of hits fairly expensive.
> The proposed alternative implementation creates a packed list of pre-sorted ordinals
> for the sort terms and a map from document IDs to entries in the sorted ordinals list. This
> results in very low memory overhead and faster sorted searches, at the cost of increased startup time.
> As the ordinals can be resolved to terms after the sorting has been performed, this approach
> supports fillFields=true.
> This issue is related to https://issues.apache.org/jira/browse/LUCENE-2335, which contains
> previous discussions on the subject.
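
The data structure described in the issue can be illustrated with a small, self-contained sketch. It is not the LUCENE-2369 patch and uses no Lucene classes: OrdinalSortSketch and its fields are hypothetical names, the term dictionary is modelled as a plain String array indexed by ordinal, and ordinary int arrays stand in for the packed representation the issue proposes. The collator is used only once, at build time; comparisons at search time are integer comparisons, and terms are resolved from ordinals afterwards.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical illustration; plain int[] is used where the issue proposes packed lists. */
final class OrdinalSortSketch {
    final String[] termsByOrdinal; // stand-in for access-by-ordinal term lookup
    final int[] sortedOrdinals;    // term ordinals arranged in collator order
    final int[] docToSortedPos;    // docId -> index into sortedOrdinals

    OrdinalSortSketch(String[] termsByOrdinal, int[] docToOrdinal, Collator collator) {
        this.termsByOrdinal = termsByOrdinal;

        // Startup cost: sort the ordinals once with the collator.
        Integer[] boxed = new Integer[termsByOrdinal.length];
        for (int i = 0; i < boxed.length; i++) {
            boxed[i] = i;
        }
        Arrays.sort(boxed,
            (a, b) -> collator.compare(termsByOrdinal[a], termsByOrdinal[b]));

        // Record both directions: position -> ordinal and ordinal -> position.
        sortedOrdinals = new int[boxed.length];
        int[] ordinalToPos = new int[boxed.length];
        for (int pos = 0; pos < boxed.length; pos++) {
            sortedOrdinals[pos] = boxed[pos];
            ordinalToPos[boxed[pos]] = pos;
        }

        // Map each document to its entry in the sorted ordinals list.
        docToSortedPos = new int[docToOrdinal.length];
        for (int doc = 0; doc < docToOrdinal.length; doc++) {
            docToSortedPos[doc] = ordinalToPos[docToOrdinal[doc]];
        }
    }

    /** Search-time comparison: integers only, no collator call. */
    int compareDocs(int docA, int docB) {
        return Integer.compare(docToSortedPos[docA], docToSortedPos[docB]);
    }

    /** Resolve the sort term after sorting, as needed for fillFields=true. */
    String termForDoc(int doc) {
        return termsByOrdinal[sortedOrdinals[docToSortedPos[doc]]];
    }

    public static void main(String[] args) {
        String[] terms = {"Cherry", "apple", "banana"}; // ordinals 0, 1, 2
        int[] docToOrdinal = {2, 0, 1};                 // doc 0 has "banana", etc.
        OrdinalSortSketch sketch = new OrdinalSortSketch(
            terms, docToOrdinal, Collator.getInstance(Locale.ENGLISH));
        System.out.println(Arrays.toString(sketch.docToSortedPos));
        System.out.println(sketch.termForDoc(1));
    }
}
```

The sketch assumes every document has exactly one sort term; documents without a term would need a reserved position.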

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



