Date: Mon, 27 Jun 2011 20:37:47 +0000 (UTC)
From: "Martin Grotzke (JIRA)"
To: dev@lucene.apache.org
Message-ID: <810195608.44774.1309207067612.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <655111260.6309.1307612818974.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (SOLR-2583) Make external scoring more efficient (ExternalFileField, FileFloatSource)

[
https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055737#comment-13055737 ]

Martin Grotzke commented on SOLR-2583:
--------------------------------------

bq. Looking at your test, I think it is reasonable. But I'd like to use CompactByteArray. I saw it wins over HashMap and float[] when 5% and above in my test.

Can you share your test code or something similar? Perhaps you can just fork https://github.com/magro/lucene-solr/ and add an appropriate test that reflects your data?

> Make external scoring more efficient (ExternalFileField, FileFloatSource)
> -------------------------------------------------------------------------
>
>                 Key: SOLR-2583
>                 URL: https://issues.apache.org/jira/browse/SOLR-2583
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Martin Grotzke
>            Priority: Minor
>         Attachments: FileFloatSource.java.patch, patch.txt
>
>
> External scoring consumes a lot of memory, depending on the number of documents in the index. The ExternalFileField (used for external scoring) uses FileFloatSource, where one FileFloatSource is created per external scoring file. FileFloatSource creates a float array sized to the number of docs (this also happens if the file to load is not found). If the scoring file contains far fewer entries than the index has docs, the large float array wastes a lot of memory.
> This could be optimized by using a map of doc -> score, so that the map contains as many entries as there are scoring entries in the external file, but not more.
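
The trade-off the issue description draws between a float array with one slot per doc and a doc -> score map can be sketched roughly as below. This is only an illustration and is not taken from the attached patches or from the actual FileFloatSource code: the class name `SparseScoreSketch`, its methods, and the default score parameter are all hypothetical. Note that a `HashMap<Integer, Float>` boxes its keys and values, so each entry costs considerably more than one float; the map is likely to pay off only when the scoring file covers a small fraction of the docs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; all names here are hypothetical, not Solr API. */
final class SparseScoreSketch {

    /** Dense layout: one float slot per document, whether or not the file mentions it. */
    static float[] denseScores(int maxDoc, Map<Integer, Float> fileEntries, float defaultScore) {
        float[] scores = new float[maxDoc];
        Arrays.fill(scores, defaultScore);
        for (Map.Entry<Integer, Float> e : fileEntries.entrySet()) {
            scores[e.getKey()] = e.getValue();
        }
        return scores;
    }

    /** Sparse layout: only documents present in the scoring file occupy an entry. */
    static final class SparseScores {
        private final Map<Integer, Float> scores;
        private final float defaultScore;

        SparseScores(Map<Integer, Float> fileEntries, float defaultScore) {
            this.scores = new HashMap<>(fileEntries);
            this.defaultScore = defaultScore;
        }

        /** Returns the score from the file, or the default for docs the file does not list. */
        float score(int doc) {
            Float s = scores.get(doc);
            return s != null ? s : defaultScore;
        }

        int size() {
            return scores.size();
        }
    }

    public static void main(String[] args) {
        // Assumed example data: an index of 1,000,000 docs, only two of them scored externally.
        Map<Integer, Float> entries = new HashMap<>();
        entries.put(42, 1.5f);
        entries.put(99_999, 0.25f);

        float[] dense = denseScores(1_000_000, entries, 0f);
        SparseScores sparse = new SparseScores(entries, 0f);

        System.out.println("dense slots:    " + dense.length);
        System.out.println("sparse entries: " + sparse.size());
        System.out.println("score(42):      " + sparse.score(42));
    }
}
```

Both layouts return the same score for every doc; they differ only in how many slots they allocate.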