lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Exception while doing sorting
Date Tue, 23 Sep 2008 12:39:57 GMT
That still seems excessive. Are you measuring your first sort? Lucene
builds up caches to help sort with the first few *sorts* that happen, so
that's a possibility.

But if that isn't the case, I think you need to slap a profiler on the
problem and see where you're spending your time. I'd also be careful
about what you measure when you measure your query. For instance,
I've been fooled by measuring the total time to get an assembled response
and it turned out that the time was spent fetching the documents, NOT
searching/sorting.

Try measuring various operations. In particular comment out anything having
to do with assembling the response. Perhaps just substitute in making a list
of the doc IDs and time *that*. Slowly build back up to your current app,
and
I suspect that one of the steps will cause your time to increase
dramatically.

How many documents are you assembling to respond? If you're assembling
40,000 hits, then 10-20 seconds may not be unreasonable.

Best
Erick

On Tue, Sep 23, 2008 at 12:51 AM, Ganesh - yahoo <emailgane@yahoo.co.in>wrote:

> System Specification:
> Processor speed: 2Ghz
> Ram: 3 GB
>
> IndexDB size 5 GB.
> Total documents indexed: 5.8 million.
>
> To collect hits, i have replaced Hits object with TopFieldDocs. This has
> improved the search performance better. Sorting is faster on date / long
> field, but it is very slow on string field. In a standalone application it
> took 10 - 20 secs to dispaly the results sorted on string field. [I am not
> opening indexsearcher every time].
>
> Regards
> Ganesh
>
>
>
> ----- Original Message ----- From: "Erick Erickson" <
> erickerickson@gmail.com>
> To: <java-user@lucene.apache.org>
> Sent: Monday, September 22, 2008 6:29 PM
>
> Subject: Re: Exception while doing sorting
>
>
>  Sure, your tomcat instance is assigning some amount of memory
>> to the JVM that your searcher is running in. Of course, now you're
>> going to ask me now to increase that number... I have no idea but
>> I've seen this question multiple times in the mail archive,
>> so a search there or in the tomcat docs should let you know.
>>
>> But 12 seconds is still a long time to wait for a search to complete.
>> Can you tell us more about your search?
>>
>> For instance, are you opening a searcher for each request? That's bad.
>> Are you sorting? that can take a long time, but again the first one
>> will have a performance penalty as things are cached.
>>
>> There are a number of tips here:
>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>
>> Best
>> Erick
>>
>> On Mon, Sep 22, 2008 at 7:45 AM, Ganesh - yahoo <emailgane@yahoo.co.in
>> >wrote:
>>
>>  My index crossed 5 GB and 5 million documents are indexed.
>>> My query includes searching and sorting returns 40000 hits.
>>>
>>> If i do search from a standalone application, the results are returned in
>>> 12 seconds. If i perform the same from web application running inside
>>> Tomcat, out of memory exception is occured.
>>>
>>> Could any one clarify it?
>>>
>>> Regards
>>> Ganesh
>>>
>>> ----- Original Message ----- From: "Ganesh - yahoo" <
>>> emailgane@yahoo.co.in
>>> >
>>> To: <java-user@lucene.apache.org>
>>> Sent: Friday, September 19, 2008 10:56 AM
>>>
>>> Subject: Re: Exception while doing sorting
>>>
>>>
>>>  Ok. If i distribure the indexes, whether sorting would be faster?
>>>
>>>>
>>>> In Lucene user group mailing list, most emails suggests to use single
>>>> indicies. Searching across the indexes may not be slower?
>>>>
>>>>  Lucene uses FieldCache for sorting on non-tokenized field and tries to
>>>>
>>>>> maintain fields from all your 4 millions documents, even if you need
>>>>>> to sort only 4000 docs.
>>>>>>
>>>>>>  Don't know why Lucene keeps all terms in FieldCache for sorting.
It
>>>>>
>>>> supposed to sort only the hits. Please clarify?
>>>>
>>>> Regards
>>>> Ganesh
>>>>
>>>> ----- Original Message ----- From: "Otis Gospodnetic" <
>>>> otis_gospodnetic@yahoo.com>
>>>> To: <java-user@lucene.apache.org>
>>>> Sent: Thursday, September 18, 2008 12:17 PM
>>>> Subject: Re: Exception while doing sorting
>>>>
>>>>
>>>>  If your index is increasing in size so fast, you should start thinking
>>>>
>>>>> about sharding your index (breaking it into multiple smaller indices
>>>>> that
>>>>> each fits on its server) and searching across them (aka distributed
>>>>> search).
>>>>>
>>>>> Yes, Lucene can handle millions of records if run on adequate hardware
>>>>> and if used correctly.
>>>>>
>>>>> Otis
>>>>> --
>>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message ----
>>>>>
>>>>>  From: Ganesh - yahoo <emailgane@yahoo.co.in>
>>>>>> To: java-user@lucene.apache.org
>>>>>> Sent: Thursday, September 18, 2008 12:53:19 AM
>>>>>> Subject: Re: Exception while doing sorting
>>>>>>
>>>>>> My index is growing by 1 million records per day. How much memory
do i
>>>>>> need
>>>>>> to increase.
>>>>>>
>>>>>> What kind of sorting algorithm being used in Lucene. Is this efficient
>>>>>> enough to handle millions of records.
>>>>>>
>>>>>> Whether we could do sorting using our own algorithm?
>>>>>>
>>>>>> Regards
>>>>>> Ganesh
>>>>>>
>>>>>> ----- Original Message ----- From: "Fuad Efendi"
>>>>>> To:
>>>>>> Sent: Wednesday, September 17, 2008 7:28 PM
>>>>>> Subject: Re: Exception while doing sorting
>>>>>>
>>>>>>
>>>>>> Increase memory.
>>>>>>
>>>>>> Lucene uses FieldCache for sorting on non-tokenized field and tries
to
>>>>>> maintain fields from all your 4 millions documents, even if you need
>>>>>> to sort only 4000 docs.
>>>>>> ==============
>>>>>> http://www.tokenizer.org/bot.html
>>>>>>
>>>>>>
>>>>>> Quoting Ganesh - yahoo :
>>>>>>
>>>>>> > Hello all,
>>>>>> >
>>>>>> > I am have indexed more than 4 million documents. My query fetches
>>>>>> > 300,000 hits. If i perform sorting on any field, then tomcat
reports
>>>>>> > out of memory exception.
>>>>>> > Sometimes the query results may be around 1000, but sorting
on any
>>>>>> > field might take more than 30 - 50 secs.
>>>>>> >
>>>>>> > I don't know what's going wrong.
>>>>>> >
>>>>>> > My index searcher is static object and it is getting refreshed
every
>>>>>> > minute. JSP pages directly calls the index searcher object and
>
>>>>>> performs
>>>>>> > search.
>>>>>> >
>>>>>> > Regards
>>>>>> > Ganesh
>>>>>> >
>>>>>> >
>>>>>> > Send instant messages to your online friends
>>>>>> > http://in.messenger.yahoo.com
>>>>>> >
>>>>>> ---------------------------------------------------------------------
>>>>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>> Send instant messages to your online friends
>>>>>> http://in.messenger.yahoo.com
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>>
>>>>>  Send instant messages to your online friends
>>>> http://in.messenger.yahoo.com
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>>  Send instant messages to your online friends
>>> http://in.messenger.yahoo.com
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>
>>
> Send instant messages to your online friends http://in.messenger.yahoo.com
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message