lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: How to improve search time?
Date Tue, 04 Aug 2009 11:31:09 GMT
Still surprising that your searches are taking so long.

Have you worked through everything on
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, suggested by
someone earlier in this thread?  Are you sure that the problem is
really with Lucene? Is it the search itself that takes a long time, or
retrieving data for the hits?  What does query.toString() look like?
How many hits does a search typically match?  Is a search on document
id effectively instant?
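
One rough way to answer the search-versus-retrieval question is to time the two phases separately. The sketch below is only an illustration: `PhaseTimer`, `Timed`, and `run` are hypothetical names, and the search and fetch steps are passed in as plain functions so that the snippet runs without Lucene on the classpath. In a real application they would presumably wrap calls such as `IndexSearcher.search(query, n)` and `IndexSearcher.doc(docId)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.IntFunction;

/** Hypothetical helper: times the search phase and the fetch phase separately. */
final class PhaseTimer {

    /** Result of one timed run: the fetched documents plus per-phase durations. */
    static final class Timed<D> {
        final List<D> docs;
        final long searchNanos; // time spent finding matching document ids
        final long fetchNanos;  // time spent loading data for those ids

        Timed(List<D> docs, long searchNanos, long fetchNanos) {
            this.docs = docs;
            this.searchNanos = searchNanos;
            this.fetchNanos = fetchNanos;
        }
    }

    /**
     * Runs search (query string to matching ids), then fetch (id to document),
     * measuring each phase with System.nanoTime().
     */
    static <D> Timed<D> run(Function<String, int[]> search,
                            IntFunction<D> fetch,
                            String query) {
        long t0 = System.nanoTime();
        int[] ids = search.apply(query);
        long t1 = System.nanoTime();

        List<D> docs = new ArrayList<>(ids.length);
        for (int id : ids) {
            docs.add(fetch.apply(id));
        }
        long t2 = System.nanoTime();

        return new Timed<>(docs, t1 - t0, t2 - t1);
    }

    private PhaseTimer() {}
}
```

If `searchNanos` dominates, the query itself is the likely bottleneck; if `fetchNanos` dominates, the time is going into loading data for the hits. Repeat the measurement a few times, since the first run against a cold index is usually not representative.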

You have to supply more detail if you want better answers.


--
Ian.


On Tue, Aug 4, 2009 at 12:21 PM, prashant
ullegaddi<prashullegaddi@gmail.com> wrote:
> Shashi,
>
> Our queries are free-text queries, but they are expanded into MultiField and
> Boolean queries. We also expand the original query using Lucene's SynExpand,
> so a simple query can grow to, say, a page in size.
>
> And we are not storing any other fields except key (document IDs), target
> URLs and titles.
>
> Prashant.
>
> On Tue, Aug 4, 2009 at 1:31 PM, Shashi Kant <shashi.mit@gmail.com> wrote:
>
>> Prashant, I have had better luck with even larger sized indices on
>> similar platforms. Could you elaborate what types of queries you are
>> running, Multifield? Boolean? combinations? etc. Also you might want
>> to remove unnecessary stored fields from the index and move them to a
>> relational db to squeeze out better performance.
>>
>>
>> Shashi
>>
>>
>> On Tue, Aug 4, 2009 at 3:18 AM, prashant
>> ullegaddi<prashullegaddi@gmail.com> wrote:
>> > I did that as well. Actually, we had 32 indexes initially. We searched
>> > them, and it was even worse.
>> > After that I merged them into 4 indexes and did the same. No gain!
>> >
>> > Then I had to merge the 32 indexes into one.
>> >
>> > On Tue, Aug 4, 2009 at 10:48 AM, Anshum <anshumg@gmail.com> wrote:
>> >
>> >> Hi Prashant,
>> >> 8 seconds as the minimum time is a little too much, though considering
>> >> you're using just 4G of RAM it's still OK.
>> >> I would advise you to break your index into smaller indexes, perhaps
>> >> selectively query the indexes (if that's possible for your application),
>> >> and use a ParallelMultiSearcher. It's just something that you might try
>> >> and like.
>> >> All said and done, parallelizing would only get you a bell-curve-like
>> >> performance graph, so you'd have to figure out the sweet spot there.
>> >>
>> >> --
>> >> Anshum Gupta
>> >> Naukri Labs!
>> >> http://ai-cafe.blogspot.com
>> >>
>> >> The facts expressed here belong to everybody, the opinions to me. The
>> >> distinction is yours to draw............
>> >>
>> >>
>> >> On Tue, Aug 4, 2009 at 10:08 AM, prashant ullegaddi <
>> >> prashullegaddi@gmail.com> wrote:
>> >>
>> >> > I'm running it on a quad-core machine, 2.4GHz per core, with 4GB RAM.
>> >> >
>> >> > Prashant.
>> >> >
>> >> > On Tue, Aug 4, 2009 at 8:38 AM, Otis Gospodnetic
>> >> > <otis_gospodnetic@yahoo.com> wrote:
>> >> >
>> >> > > With such a large index be prepared to put it on a server with
>> >> > > lots of RAM (even if you follow all the tips from the Wiki).
>> >> > > When reporting performance numbers, you really ought to tell us
>> >> > > about your hardware, types of queries, etc.
>> >> > >
>> >> > > Otis
>> >> > > --
>> >> > > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> >> > > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>> >> > >
>> >> > >
>> >> > >
>> >> > > ----- Original Message ----
>> >> > > > From: prashant ullegaddi <prashullegaddi@gmail.com>
>> >> > > > To: java-user@lucene.apache.org
>> >> > > > Sent: Monday, August 3, 2009 12:33:46 AM
>> >> > > > Subject: How to improve search time?
>> >> > > >
>> >> > > > Hi,
>> >> > > >
>> >> > > > I have a single 87GB index containing around 50M documents.
>> >> > > > For any query, the best search time I have observed is 8 seconds.
>> >> > > > When the query is expanded with synonyms, the search takes
>> >> > > > minutes (about 2-3 min). Is there a better way to search so that
>> >> > > > the overall search time is reduced?
>> >> > > >
>> >> > > > Thanks,
>> >> > > > Prashant.
>> >> > >
>> >> > >
>> >> > >
>> ---------------------------------------------------------------------
>> >> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> > > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >> > >
>> >> > >
>> >> >
>> >>
>> >
>>
>>
>>
>
