lucene-java-user mailing list archives

From Matthew Hall <mh...@informatics.jax.org>
Subject Re: How to improve search time?
Date Tue, 04 Aug 2009 12:55:25 GMT
Also, how long does it take Luke to do a search against the same index?

That way you can remove any of the timing that your application is 
adding into the mix.

If Luke doesn't take the minimum of 8 seconds, then you know the issue 
(or at least a large part of it) is with your app.
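
To see where the time goes inside the app, one option is to time the search call and the retrieval of hit data as separate phases. What follows is only a minimal sketch using plain Java: PhaseTimer and Timed are hypothetical names, and the two lambdas in main are stand-ins for the real search call and the real loop that loads stored fields, not Lucene API calls.

```java
import java.util.function.Supplier;

/** Times one phase of a request so search time and hit retrieval can be compared. */
final class PhaseTimer {

    /** Result of a timed call: the value it returned and the elapsed wall-clock time. */
    static final class Timed<T> {
        final T value;
        final long nanos;

        Timed(T value, long nanos) {
            this.value = value;
            this.nanos = nanos;
        }

        double millis() {
            return nanos / 1_000_000.0;
        }
    }

    /** Runs the given phase once and records how long it took. */
    static <T> Timed<T> time(Supplier<T> phase) {
        long start = System.nanoTime();
        T value = phase.get();
        return new Timed<>(value, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins: replace the first lambda with the real search
        // call and the second with the loop that loads stored fields per hit.
        Timed<int[]> search = time(() -> new int[] {3, 7, 42});
        Timed<Integer> fetch = time(() -> search.value.length);

        System.out.printf("search: %.3f ms, fetch: %.3f ms%n",
                search.millis(), fetch.millis());
    }
}
```

If the search phase is fast in both Luke and the app but the fetch phase is slow, the time is going into retrieving data for the hits, not into the search itself.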

Matt

Ian Lea wrote:
> Still surprising that your searches are taking so long.
>
> Have you worked through everything on
> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, suggested by
> someone earlier in this thread?  Are you sure that the problem is
> really with lucene? Is it the search itself that takes a long time, or
> retrieving data for the hits?  What does query.toString() look like?
> How many hits does a search typically match?  Is a search on document
> id effectively instant?
>
> You have to supply more detail if you want better answers.
>
>
> --
> Ian.
>
>
> On Tue, Aug 4, 2009 at 12:21 PM, prashant
> ullegaddi<prashullegaddi@gmail.com> wrote:
>   
>> Shahi,
>>
>> Our queries are free text queries. But they will be expanded into:
>> Multifield, Boolean.
>> We are also expanding the original query using SynExpand of lucene. A simple
>> query gets expanded to, say, a query of page size.
>>
>> And we are not storing any other fields except key (document IDs), target
>> URLs and titles.
>>
>> Prashant.
>>
>> On Tue, Aug 4, 2009 at 1:31 PM, Shashi Kant <shashi.mit@gmail.com> wrote:
>>
>>     
>>> Prashant, I have had better luck with even larger sized indices on
>>> similar platforms. Could you elaborate what types of queries you are
>>> running, Multifield? Boolean? combinations? etc. Also you might want
>>> to remove unnecessary stored fields from the index and move them to a
>>> relational db to squeeze out better performance.
>>>
>>>
>>> Shashi
>>>
>>>
>>> On Tue, Aug 4, 2009 at 3:18 AM, prashant
>>> ullegaddi<prashullegaddi@gmail.com> wrote:
>>>       
>>>> I did that as well. Actually, we had 32 indexes initially. We searched them.
>>>> It was even worse.
>>>> After that I merged them into 4 indexes. And did the same. No gain!
>>>>
>>>> Then, I had to merge 32 indexes into one.
>>>>
>>>> On Tue, Aug 4, 2009 at 10:48 AM, Anshum <anshumg@gmail.com> wrote:
>>>>
>>>>         
>>>>> Hi Prashant,
>>>>> 8 seconds as the minimum time is a little too much, though considering
>>>>> you're using just 4G of RAM it's still ok.
>>>>> I would advise you to break your index into smaller indexes, perhaps
>>>>> selectively query the indexes (if that's possible for your application) and
>>>>> use a ParallelMultiSearcher. It's just something that you might try and
>>>>> like.
>>>>> All said and done, parallelizing would only get you a bell-curve like
>>>>> performance graph, so you'd have to figure out the sweet spot there.
>>>>>
>>>>> --
>>>>> Anshum Gupta
>>>>> Naukri Labs!
>>>>> http://ai-cafe.blogspot.com
>>>>>
>>>>> The facts expressed here belong to everybody, the opinions to me. The
>>>>> distinction is yours to draw............
>>>>>
>>>>>
>>>>> On Tue, Aug 4, 2009 at 10:08 AM, prashant ullegaddi <
>>>>> prashullegaddi@gmail.com> wrote:
>>>>>
>>>>>           
>>>>>> I'm running it on Quadcore, 2.4GHz each, 4GB RAM.
>>>>>>
>>>>>> Prashant.
>>>>>>
>>>>>> On Tue, Aug 4, 2009 at 8:38 AM, Otis Gospodnetic <
>>>>>> otis_gospodnetic@yahoo.com> wrote:
>>>>>>
>>>>>>> With such a large index be prepared to put it on a server with lots of
>>>>>>> RAM (even if you follow all the tips from the Wiki).
>>>>>>> When reporting performance numbers, you really ought to tell us about
>>>>>>> your hardware, types of queries, etc.
>>>>>>>
>>>>>>> Otis
>>>>>>> --
>>>>>>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>>>>>>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message ----
>>>>>>>               
>>>>>>>> From: prashant ullegaddi <prashullegaddi@gmail.com>
>>>>>>>> To: java-user@lucene.apache.org
>>>>>>>> Sent: Monday, August 3, 2009 12:33:46 AM
>>>>>>>> Subject: How to improve search time?
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I've a single index of size 87GB containing around 50M documents.
>>>>>>>> When I search for any query, best search time I observed was 8sec.
>>>>>>>> And when query is expanded with synonyms, search takes minutes
>>>>>>>> (~ 2-3min). Is there a better way to search so that overall search
>>>>>>>> time reduces?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Prashant.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org


-- 
Matthew Hall
Software Engineer
Mouse Genome Informatics
mhall@informatics.jax.org
(207) 288-6012


