lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mohammad Norouzi" <mnr...@gmail.com>
Subject Re: how to define a pool for Searcher?
Date Thu, 08 Mar 2007 05:01:35 GMT
Erick and Mark thank you very much, you really give me good information. so
I decided to try HitCollector and see how it works. but about storing
document ID I dont think it is good because the result may be exceed than 50
000 and I just were optimistic about telling that number ;)

any way, I will try them, even storing IDs.

Thank you very much again.

-- 
Regards,
Mohammad

On 3/7/07, Mark Miller <markrmiller@gmail.com> wrote:
>
> You should not be returning 50 thousand documents. Since you are
> implementing paging, you should only return enough to cover your page
> size. If a user is viewing page 1 with documents 1-10, you send back
> information for 10 of the docs. On page 2, 10-20, you send back
> information for 10 of the docs, after executing a new search and
> skimming off the correct docs.
>
> Also, unless your documents are *very* big, 500,000 documents on any
> sort of modern hardware is a cakewalk for Lucene.
>
> Using Hits as it was meant to be used is generally a poor decision in a
> multi-user environment in my opinion. You will be faster and scale
> better using a stateless paradigm instead of taking advantage of Hits
> caching. Besides that, if you are currently loading up 50 thousand
> documents from Hits, you are doing something that is extremely
> inefficient. As Erick said, Hits will continuously re-query to fill it
> ups cache. After grabbing info for a 100 docs it will re query, then
> after 200 it will require, etc.
>
> Stateless man. Trust me.
>
> - Mark
>
> Mohammad Norouzi wrote:
> > yes I am very concerned about this because we have a big project with
> > many
> > users and I am responsible for this. the thing that preoccupied my
> > mind is
> > application performance because there is more than 500 thousands records
> > (documents).
> >
> > a single search may returns about 50 thousand documents and it is not
> > possible to put all of them into a say java.util.List  so I have to
> > keep the
> > searcher open and move forward or backward through the Hits and when the
> > user clicked on "Finish" button or the time exceeds over than a specific
> > time,(or the session destroyed)  so I set a flag to true, then the other
> > session can access that searcher without closing any searcher or reader.
> >
> > any way, your comments are useful for me.
> > thanks
> >
> > On 3/7/07, Mark Miller <markrmiller@gmail.com> wrote:
> >>
> >> To address your hits question: I wouldn't keep hits around, but would
> >> re-search instead. It is often more of a headache than a time savings
> to
> >> keep around all of the Hits objects and to have to manage them. I made
> >> my own Hits object that does no caching because of this. Pagination is
> >> often best done by re-querying.
> >>
> >> Also, keep in mind that you prob won't have 1000 Similarities...you
> will
> >> prob have much closer to 1 <g>, maybe a couple if you have created a
> >> custom one. The biggest chance you have more than one Searcher cached
> >> for an Index is if you have a MultiSearcher cached that searches over
> >> it. Out of the box, indexAccessor does not handle MultiSearchers
> >> perfectly though...it does not check out a Searcher for each of the
> >> underlying Indexes, so you will have to do that your self...then
> >> remember to release them all when you release the MultiSearcher.
> >>
> >> I think in general, you are over concerned. IndexAccessor will handle
> >> most of this for you without much intervention on your part.
> >>
> >> - Mark
> >>
> >> Mohammad Norouzi wrote:
> >> > Hello Mark,
> >> > there is something vague for me about the Lucene-indexAccessor you
> >> > created
> >> > and my problem.
> >> > as I see your codes, you create IndexSearcher and put it into a Map
> >> > and the
> >> > only thing that separate them is the Similarity the have. so if say
> >> 1000
> >> > users with different Similarity connect to my application there will
> >> > be 1000
> >> > IndexSearcher with their own internal Reader.
> >> > now, in my case, I have an IndexResultSet just like
> java.sql.ResultSet
> >> > which
> >> > it contains a Hits. so a user may go forward or backward through the
> >> > Hits'
> >> > documents and actually every user are doing this job.
> >> >
> >> > to do so, I have to find the Similarity that a user working with it
> >> > and find
> >> > the right IndexSearcher in order to support pagination for her. is
> >> this
> >> > right? I mean can I trust to Similarity to find the right IndexReader
> >> > that a
> >> > user have used it before?
> >> >
> >> > another question is, how about I have one IndexReader for all my
> >> > IndexSearcher and manage them simultaneously to access that single
> >> > Reader.?
> >> >
> >> > thank you very much in advance
> >> >
> >> >
> >> > On 2/22/07, Mark Miller <markrmiller@gmail.com> wrote:
> >> >>
> >> >> I would not do this from scratch...if you are interested in Solr go
> >> that
> >> >> route else I would build off
> >> >> http://issues.apache.org/jira/browse/LUCENE-390
> >> >>
> >> >> - Mark
> >> >>
> >> >> Mohammad Norouzi wrote:
> >> >> > Hi all,
> >> >> > I am going to build a Searcher pooling. if any one has
> >> experience on
> >> >> > this, I
> >> >> > would be glad to hear his/her recommendation and suggestion. I
> want
> >> to
> >> >> > know
> >> >> > what issues I should be apply. considering I am going to use
> >> this on
> >> a
> >> >> > web
> >> >> > application with many user sessions.
> >> >> >
> >> >> > thank you very much in advance.
> >> >>
> >> >>
> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >>
> >> >>
> >> >
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message