lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <markrmil...@gmail.com>
Subject Re: lucene 2.9.0RC4 slower than 2.4.1?
Date Wed, 16 Sep 2009 17:13:07 GMT
Notice that while DisjunctionScorer.advance and
DisjuntionScorer.advanceAfterCurrent appear to be called
in 2.9, in 2.4, I am only seeing DisjuntionScorer.advanceAfterCurrent
called.

Can someone explain that?

Mark Miller wrote:
> Something is very odd about this if they both cover the same search and
> the environ for both is identical. Even if one search was done twice,
> and we divide the numbers for the new api by 2 - its still *very* odd.
>
> With 2.4, ScorerDocQueue.topDoc is called half a million times.
> With 2.9, its called over 4 million times.
>
> Huh?
>
> Thomas Becker wrote:
>   
>> No it's only a single segment. But two calls. One doing a getHitsCount first and
>> the other doing the actual search. I'll paste both methods below if someone's
>> interested.
>>
>> Will dig into lucene's sources and compare 2.4 search behaviour for my case with
>> 2.9 tomorrow. It was about time to get more into lucene-core sources anyhow. :)
>>
>> See you tomorrow guys and thanks a lot again! It's a pleasure.
>>
>> 	public int getHitsCount(String query, Filter filter) throws
>> LuceneServiceException {
>> 		log.debug("getHitsCount('{}, {}')", query, filter);
>> 		if (StringUtils.isBlank(query)) {
>> 			log.warn("getHitsCount: empty lucene query");
>> 			return 0;
>> 		}
>> 		long startTimeMillis = System.currentTimeMillis();
>> 		int count = 0;
>>
>> 		if (indexSearcher == null) {
>> 			return 0;
>> 		}
>>
>> 		BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>> 		Query q = null;
>> 		try {
>> 			q = createQuery(query);
>> 			TopScoreDocCollector tsdc = TopScoreDocCollector.create(1, true);
>> 			indexSearcher.search(q, filter, tsdc);
>> 			count = tsdc.getTotalHits();
>> 			log.info("getHitsCount: count = {}",count);
>> 		} catch (ParseException ex) {
>> 			throw new LuceneServiceException("invalid lucene query:" + query, ex);
>> 		} catch (IOException e) {
>> 			throw new LuceneServiceException(" indexSearcher could be corrupted", e);
>> 		} finally {
>> 			long durationMillis = System.currentTimeMillis() - startTimeMillis;
>> 			if (durationMillis > slowQueryLimit) {
>> 				log.warn("getHitsCount: Slow query: {} ms, query={}", durationMillis, query);
>> 			}
>> 			log.debug("getHitsCount: query took {} ms", durationMillis);
>> 		}
>> 		return count;
>> 	}
>>
>> 	public List<Document> search(String query, Filter filter, Sort sort, int from,
>> int size) throws LuceneServiceException {
>> 		log.debug("{} search('{}', {}, {}, {}, {})", new Object[] { indexAlias, query,
>> filter, sort, from, size });
>> 		long startTimeMillis = System.currentTimeMillis();
>>
>> 		List<Document> docs = new ArrayList<Document>();
>> 		if (indexSearcher == null) {
>> 			return docs;
>> 		}
>> 		Query q = null;
>> 		try {
>> 			if (query == null) {
>> 				log.warn("search: lucene query is null...");
>> 				return docs;
>> 			}
>> 			q = createQuery(query);
>> 			BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>> 			if (size < 0 || size > maxNumHits) {
>> 				// set hard limit for numHits
>> 				size = maxNumHits;
>> 				if (log.isDebugEnabled())
>> 					log.debug("search: Size set to hardlimit: {} for query: {} with filter:
>> {}", new Object[] { size, query, filter });
>> 			}
>> 			TopFieldCollector collector = TopFieldCollector.create(sort, size + from,
>> true, false, false, true);
>> 			indexSearcher.search(q, filter, collector);
>> 			if(size > collector.getTotalHits())
>> 				size = collector.getTotalHits();
>> 			if (size > 100000)
>> 				log.info("search: size: {} bigger than 100.000 for query: {} with filter:
>> {}", new Object[] { size, query, filter });
>> 			TopDocs td = collector.topDocs(from, size);
>> 			ScoreDoc[] scoreDocs = td.scoreDocs;
>> 			for (ScoreDoc scoreDoc : scoreDocs) {
>> 				docs.add(indexSearcher.doc(scoreDoc.doc));
>> 			}
>> 		} catch (ParseException e) {
>> 			log.warn("search: ParseException: {}", e.getMessage());
>> 			if (log.isDebugEnabled())
>> 				log.warn("search: ParseException: ", e);
>> 			return Collections.emptyList();
>> 		} catch (IOException e) {
>> 			log.warn("search: IOException: ", e);
>> 			return Collections.emptyList();
>> 		} finally {
>> 			long durationMillis = System.currentTimeMillis() - startTimeMillis;
>> 			if (durationMillis > slowQueryLimit) {
>> 				log.warn("search: Slow query: {} ms, query={}, indexUsed={}",
>> 						new Object[] { durationMillis, query,
>> indexSearcher.getIndexReader().directory() });
>> 			}
>> 			log.debug("search: query took {} ms", durationMillis);
>> 		}
>> 		return docs;
>> 	}
>>
>>
>> Uwe Schindler wrote:
>>   
>>     
>>>>> http://ankeschwarzer.de/tmp/lucene_29_newapi_mmap_singlereq.png
>>>>>
>>>>> Have to verify that the last one is not by accident more than one
>>>>>         
>>>>>           
>>>> request.
>>>>       
>>>>         
>>>>> Will
>>>>> do the run again and then post the required info.
>>>>>         
>>>>>           
>>>> The last figure shows, that IndexSearcher.searchWithFilter was called
>>>> twice
>>>> in contrast to the first figure, where IndexSearcher.search was called
>>>> only
>>>> once.
>>>>       
>>>>         
>>> I forgot, searchWithFilter it is called per segment in 2.9. If it was only
>>> one search, you must have two segments and therefore no optimized index for
>>> this to be correct?
>>>
>>> Uwe
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>     
>>>       
>>   
>>     
>
>
>   


-- 
- Mark

http://www.lucidimagination.com




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message