lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: how to define a pool for Searcher?
Date Wed, 07 Mar 2007 13:35:30 GMT
You may not be able to store all the documents, but what about
just storing the document IDs in a list?

And remember that a Hits object re-queries the index every 100
documents or so when you iterate through it, so if you're really
using a Hits object, you're re-executing the query anyway.

You might want to think about using a HitCollector or TopDocs
instead, and recording just the document IDs (which is very fast,
and holding 50,000 integers isn't very expensive: roughly 200 KB
as an int[]).
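
As a rough illustration of recording only the IDs, here is a minimal sketch in plain Java. DocIdCollector is a hypothetical helper, not a Lucene class; its collect(int doc, float score) method only mirrors the shape of the per-hit callback on Lucene's HitCollector, so that a real collector could forward to it. IDs are stored in whatever order they are reported, which is not necessarily relevance order.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Lucene): accumulates document IDs in a growable int[]. */
final class DocIdCollector {
    private int[] ids = new int[1024];
    private int size = 0;

    /** Called once per matching document; the score is ignored in this sketch. */
    void collect(int doc, float score) {
        if (size == ids.length) {
            // Double the backing array when it is full.
            ids = Arrays.copyOf(ids, ids.length * 2);
        }
        ids[size++] = doc;
    }

    /** Number of IDs collected so far. */
    int size() {
        return size;
    }

    /** Returns a trimmed copy of the collected IDs, in the order they were reported. */
    int[] toArray() {
        return Arrays.copyOf(ids, size);
    }
}
```

With 50,000 hits the backing array grows to 65,536 ints, so the memory cost stays in the low hundreds of kilobytes.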

If you're paging through the hits using a Hits object, I believe
you'll find no effective difference between keeping the
searcher open and navigating through the list and
re-executing the query each time.

If that's the case, you'll see a dramatic difference by using, say,
TopDocs when you're paging into the latter parts of the list.

And you'll see an *enormous* difference if you cache the doc IDs
for your paging <G>...
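
A minimal sketch of paging over cached doc IDs, in plain Java with no Lucene dependency. DocIdPager and its method names are hypothetical, chosen only for illustration; the assumption is that the IDs were gathered once (for example from a collector) and that each page's documents are then loaded by ID from an open reader.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Lucene): serves fixed-size pages from a cached array of doc IDs. */
final class DocIdPager {
    private final int[] ids;
    private final int pageSize;

    DocIdPager(int[] ids, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.ids = ids.clone(); // defensive copy so later changes by the caller do not affect paging
        this.pageSize = pageSize;
    }

    /** Total number of pages, counting a final partial page. */
    int pageCount() {
        return (ids.length + pageSize - 1) / pageSize;
    }

    /** Returns the doc IDs on the given zero-based page, or an empty array if out of range. */
    int[] page(int pageIndex) {
        if (pageIndex < 0 || pageIndex >= pageCount()) {
            return new int[0];
        }
        int from = pageIndex * pageSize;
        int to = Math.min(from + pageSize, ids.length);
        return Arrays.copyOfRange(ids, from, to);
    }
}
```

Moving forward or backward is then just a change of page index; no query is re-executed, and only the documents on the requested page need to be fetched.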

Of course, if you're not using a Hits object most of this doesn't
count.

Best
Erick

On 3/7/07, Mohammad Norouzi <mnrz57@gmail.com> wrote:
>
> Yes, I am very concerned about this because we have a big project with many
> users and I am responsible for it. What preoccupies me is application
> performance, because there are more than 500 thousand records
> (documents).
>
> A single search may return about 50 thousand documents, and it is not
> possible to put all of them into, say, a java.util.List. So I have to keep
> the searcher open and move forward or backward through the Hits. When the
> user clicks the "Finish" button, or a specific time limit is exceeded (or
> the session is destroyed), I set a flag to true so that another
> session can access that searcher without closing any searcher or reader.
>
> Anyway, your comments are useful to me.
> Thanks
>
> On 3/7/07, Mark Miller <markrmiller@gmail.com> wrote:
> >
> > To address your hits question: I wouldn't keep hits around, but would
> > re-search instead. It is often more of a headache than a time savings to
> > keep around all of the Hits objects and to have to manage them. I made
> > my own Hits object that does no caching because of this. Pagination is
> > often best done by re-querying.
> >
> > Also, keep in mind that you prob won't have 1000 Similarities...you will
> > prob have much closer to 1 <g>, maybe a couple if you have created a
> > custom one. The biggest chance you have more than one Searcher cached
> > for an Index is if you have a MultiSearcher cached that searches over
> > it. Out of the box, IndexAccessor does not handle MultiSearchers
> > perfectly though...it does not check out a Searcher for each of the
> > underlying Indexes, so you will have to do that yourself...then
> > remember to release them all when you release the MultiSearcher.
> >
> > I think in general, you are overly concerned. IndexAccessor will handle
> > most of this for you without much intervention on your part.
> >
> > - Mark
> >
> > Mohammad Norouzi wrote:
> > > Hello Mark,
> > > There is something unclear to me about the Lucene-indexAccessor you
> > > created and how it relates to my problem.
> > > As I read your code, you create an IndexSearcher and put it into a Map,
> > > and the only thing that separates them is the Similarity they have. So if,
> > > say, 1000 users with different Similarities connect to my application,
> > > there will be 1000 IndexSearchers, each with its own internal Reader.
> > > Now, in my case, I have an IndexResultSet, just like java.sql.ResultSet,
> > > which contains a Hits. So a user may go forward or backward through the
> > > Hits' documents, and in fact every user is doing this.
> > >
> > > To do so, I have to find the Similarity that a user is working with
> > > and find the right IndexSearcher in order to support pagination for her.
> > > Is this right? I mean, can I rely on the Similarity to find the right
> > > IndexReader that a user has used before?
> > >
> > > Another question: what if I have one IndexReader for all my
> > > IndexSearchers and manage them so that they access that single Reader
> > > simultaneously?
> > >
> > > thank you very much in advance
> > >
> > >
> > > On 2/22/07, Mark Miller <markrmiller@gmail.com> wrote:
> > >>
> > >> I would not do this from scratch...if you are interested in Solr, go
> > >> that route; else I would build off
> > >> http://issues.apache.org/jira/browse/LUCENE-390
> > >>
> > >> - Mark
> > >>
> > >> Mohammad Norouzi wrote:
> > >> > Hi all,
> > >> > I am going to build a Searcher pool. If anyone has experience with
> > >> > this, I would be glad to hear his/her recommendations and suggestions.
> > >> > I want to know what issues I should consider, given that I am going
> > >> > to use this in a web application with many user sessions.
> > >> >
> > >> > thank you very much in advance.
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > >>
> > >>
> > >
> > >
> >
>
>
> --
> Regards,
> Mohammad
>
