lucene-solr-user mailing list archives

From adfel70 <adfe...@gmail.com>
Subject Executing Collector's Collect method on more than one thread
Date Sun, 31 Jan 2016 14:13:27 GMT
I am using RankQuery to implement an application-specific scorer that returns
a score based on the value of a specific field (let's call it 'score_field')
that is stored for every document.
The RankQuery creates a collector, and for every collected docId I retrieve
the value of score_field, calculate the score, and add the doc id to a
priority queue:

public class MyScorerrankQuery extends RankQuery {
        ...

        @Override
        public TopDocsCollector getTopDocsCollector(int i,
                        SolrIndexSearcher.QueryCommand cmd, IndexSearcher searcher) {
                ...
                return new MyCollector(...);
        }
}

public class MyCollector extends TopDocsCollector {
        MyScorer scorer;
        SortedDocValues scoreFieldValues;

        @Override
        public void collect(int id) {
                int docID = docBase + id;
                // 1. get specific field from the doc using DocValues and
                //    calculate score using my scorer
                String value = scoreFieldValues.get(docID).utf8ToString();
                scorer.calcScore(value);
                // 2. add docId and score (ScoreDoc object) into PriorityQueue.
        }
}

The problem is that calcScore may take ~20 ms per call, so if a query returns
100,000 docs, which is not unusual, query execution time becomes roughly 33
minutes (100,000 x 20 ms = 2,000 s). Is there a way to parallelize the
collector's logic, so that more than one thread calls calcScore
simultaneously?
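
To make the question concrete, here is a rough sketch of one possible direction, written in plain Java with no Solr or Lucene classes. The idea is that collect() would only buffer each doc id and its score_field value, and the expensive scoring would then run on a thread pool, with the top n results selected afterwards. The names ParallelScoringSketch, Hit and topN are hypothetical, and calcScore here is a trivial stand-in for the real scorer. Whether this can be wired into a TopDocsCollector is an assumption, not something confirmed against the Solr API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelScoringSketch {

    /** A doc id paired with its computed score (hypothetical helper type). */
    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for the expensive MyScorer.calcScore(value). */
    static float calcScore(String value) {
        return value.length();
    }

    /**
     * values.get(i) is assumed to be the score_field value buffered for
     * doc id i while collecting. Returns the n best hits, highest score first.
     */
    static List<Hit> topN(List<String> values, int n, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one scoring task per buffered document.
            List<Future<Hit>> futures = new ArrayList<>();
            for (int i = 0; i < values.size(); i++) {
                final int docId = i;
                final String value = values.get(i);
                Callable<Hit> task = () -> new Hit(docId, calcScore(value));
                futures.add(pool.submit(task));
            }

            // Min-heap on score: the head is the weakest of the current top n.
            Comparator<Hit> byScore = Comparator.comparingDouble((Hit h) -> h.score);
            PriorityQueue<Hit> queue = new PriorityQueue<>(byScore);
            for (Future<Hit> future : futures) {
                queue.add(future.get());
                if (queue.size() > n) {
                    queue.poll(); // drop the lowest-scoring hit
                }
            }

            // Order the survivors from best to worst.
            List<Hit> result = new ArrayList<>(queue);
            result.sort(byScore.reversed());
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

With 8 threads and ~20 ms per call, 100,000 docs would still take on the order of 2,000 s / 8 = 250 s, so a sketch like this only helps in proportion to the number of cores available.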



--
View this message in context: http://lucene.472066.n3.nabble.com/Executing-Collector-s-Collect-method-on-more-than-one-thread-tp4254269.html
Sent from the Solr - User mailing list archive at Nabble.com.
