lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2583) Make external scoring more efficient (ExternalFileField, FileFloatSource)
Date Thu, 09 Jun 2011 20:20:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046785#comment-13046785 ]

Robert Muir commented on SOLR-2583:
-----------------------------------

bq. Though, as we had 4GB taken by FileFloatSource objects, a reduction to 1/4 would still
be too much for us, so for our case I prefer the map-based approach, then with SmallFloat.

If the problem is sparsity, maybe use a two-stage table: it is still faster than a hashmap and
much better in the worst case.
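
A two-stage table along these lines might look roughly like the sketch below. It is only an
illustration, not code from the attached patch: the class name TwoStageFloatTable, the block
size of 1024 docs, and the default value are all assumptions. The first stage is an array of
block references indexed by the high bits of the doc id; a block is allocated only once some
doc in its range gets a score, so empty ranges cost one null reference each.

```java
import java.util.Arrays;

/** Hypothetical sketch: doc id -> float score, stored in lazily allocated fixed-size blocks. */
final class TwoStageFloatTable {
    private static final int BLOCK_BITS = 10;              // assumed: 1024 docs per block
    private static final int BLOCK_SIZE = 1 << BLOCK_BITS;
    private static final int BLOCK_MASK = BLOCK_SIZE - 1;

    // Stage one: block index -> block, null while no doc in that range has a score.
    private final float[][] blocks;
    private final float defaultValue;

    TwoStageFloatTable(int maxDoc, float defaultValue) {
        this.blocks = new float[(maxDoc + BLOCK_SIZE - 1) >>> BLOCK_BITS][];
        this.defaultValue = defaultValue;
    }

    void set(int doc, float score) {
        int b = doc >>> BLOCK_BITS;
        float[] block = blocks[b];
        if (block == null) {
            // Stage two: allocate the block on first write and pre-fill with the default.
            block = new float[BLOCK_SIZE];
            Arrays.fill(block, defaultValue);
            blocks[b] = block;
        }
        block[doc & BLOCK_MASK] = score;
    }

    float get(int doc) {
        // Two array lookups, no hashing and no boxing.
        float[] block = blocks[doc >>> BLOCK_BITS];
        return block == null ? defaultValue : block[doc & BLOCK_MASK];
    }

    /** Number of blocks actually allocated, as a rough indicator of memory use. */
    int allocatedBlocks() {
        int n = 0;
        for (float[] block : blocks) {
            if (block != null) n++;
        }
        return n;
    }
}
```

If every block ends up populated, this degrades to roughly the size of the plain float array
plus the first-stage references, which is the bounded worst case. How much it saves in practice
depends on how the scored docs cluster, which this sketch does not address.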


> Make external scoring more efficient (ExternalFileField, FileFloatSource)
> -------------------------------------------------------------------------
>
>                 Key: SOLR-2583
>                 URL: https://issues.apache.org/jira/browse/SOLR-2583
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Martin Grotzke
>            Priority: Minor
>         Attachments: FileFloatSource.java.patch
>
>
> External scoring uses a lot of memory, depending on the number of documents in the index.
> ExternalFileField (used for external scoring) uses FileFloatSource, and one FileFloatSource
> is created per external scoring file. FileFloatSource creates a float array sized to the
> number of docs (this is done even if the file to load is not found). If the scoring file has
> far fewer entries than there are docs in total, the big float array wastes a lot of memory.
> This could be optimized by using a map of doc -> score, so that the map contains exactly as
> many entries as there are scoring entries in the external file, but no more.
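
The map-based idea in the description could be sketched as follows. This is a minimal
illustration and not the contents of FileFloatSource.java.patch: the class name SparseScoreMap
and the default value are assumptions, and a plain java.util.HashMap is used only for brevity.
Note that boxed Integer/Float entries cost considerably more than 4 bytes each, so a map like
this only pays off when the scoring file is much smaller than the index.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: holds scores only for docs that appear in the external file. */
final class SparseScoreMap {
    private final Map<Integer, Float> scores = new HashMap<>();
    private final float defaultValue;

    SparseScoreMap(float defaultValue) {
        this.defaultValue = defaultValue;
    }

    /** Record the score read for one doc from the external file. */
    void put(int doc, float score) {
        scores.put(doc, score);
    }

    /** Return the stored score, or the default for docs without an entry. */
    float get(int doc) {
        Float v = scores.get(doc);
        return v == null ? defaultValue : v;
    }

    /** Number of entries, equal to the number of distinct docs that were given a score. */
    int size() {
        return scores.size();
    }
}
```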

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

