lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Hudson" <andrewhud...@gmail.com>
Subject Re: Reading Performance
Date Fri, 08 Dec 2006 22:47:54 GMT
I think I've seen this problem when you use Lucene's built in delete
mechanism,
IndexReader.deleteDocument I believe.  The problem was it was synchronizing
on a java BitSet, which totally killed performance when more than one
process
was using the same IndexReader.  Better way to do deletes is to use your own
BitSet in combination with a HitCollector and just ignore docs that are
deleted.

On 12/8/06, Aigner, Thomas <TAigner@wescodist.com> wrote:
>
> I have tried the HitsCollector and the time has improved ~ 3/4 a second
> on the searching.  I still get really bad times when two or more people
> ask for data at the same time.  The problem doesn't seem to be in
> writing the files, it's in getting data from the index when two or more
> people ask for large recordsets back (I can take all the I/O statements
> out and still see the performance bottleneck)
>
> Tom
>
>
> hc = new HitCollector() {
>         BufferedWriter fw2 = null;
>         public void collect(int id, float score) {
>                     try
>                       {
>
>                            Document doc = is.doc(id);
>                            if (fw2 == null)
>                             {
>                                 fw2 = new BufferedWriter( new
> FileWriter( "WHERETOWRITEFILE"), 8196 );
>                             }
>                             fw2.write(doc.get("field1") + "\n");
>                             fw2.flush();
>                             }
>                      catch (Exception ex) {
>
>                         }
>                    }
> };
> is.search(query, hc);
>
>
> -----Original Message-----
> From: Aigner, Thomas [mailto:TAigner@WescoDist.com]
> Sent: Thursday, December 07, 2006 1:36 PM
> To: java-user@lucene.apache.org
> Subject: RE: Reading Performance
>
> Thanks Grant and Erik for your suggestions.  I will try both of them and
> let you know if I see a marked increase in speed.
>
> Tom
>
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:grant.ingersoll@gmail.com]
> Sent: Thursday, December 07, 2006 1:24 PM
> To: java-user@lucene.apache.org
> Subject: Re: Reading Performance
>
> Have you done any profiling to identify hotspots in Lucene versus
> your application?
>
> You might look into the FieldSelector code (used in IndexReader) in
> the Trunk version of Lucene could be used to only load the fields you
> are interested when getting the document from disk.  This can be
> useful if you have large fields that are being loaded that you don't
> necessarily need (thus skipping them).
>
> Also, do you need the BufferedWriter construction and check in side
> the loop?  Probably small in comparison to loading, but  It seems
> like it is only created once, why have it in the loop?
>
>
>
> On Dec 7, 2006, at 1:14 PM, Aigner, Thomas wrote:
>
> >
> >
> >
> >
> > Howdy all,
> >
> >
> >
> >       I have a question on reading many documents and time to do this.
> > I have a loop on the hits object reading a record, then writing it
> > to a
> > file.  When there is only 1 user on the Index Searcher, this
> > process to
> > read say 100,000 takes around 3 seconds.  This is slow, but can be
> > acceptable.  When a few more users do searchers, this time to just
> > read
> > from the hits object becomes well over 10 seconds, sometimes even 30+
> > seconds.  Is there a better way to read through and do something with
> > the hits information?  And yes, I have to read all of them to do this
> > particular task.
> >
> >
> >
> > for (int i = 0;(i <= hits.length() - 1); i++)
> >
> > {
> >
> >
> >
> >       if (fw == null)
> >
> >       {
> >
> >             fw = new BufferedWriter( new FileWriter
> > ( searchWriteSpec ),
> > 8196) ;
> >
> >       }
> >
> >
> >
> >       //Write Out records
> >
> >       String tmpHold = "";
> >
> > tmpHold = hits.doc(i).get("somefield1") + hits.doc(i).get
> > ("somefield2");
> >
> >
> >
> >       fw.write(tmpHold + "\n" );
> >
> >
> >
> > }
> >
> >
> >
> > Any ideas on how to speed this up especially with multiple users?
> > Each
> > user gets their own class which has the above code in it.
> >
> >
> >
> > Thanks,
> >
> > Tom
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
> ------------------------------------------------------
> Grant Ingersoll
> http://www.grantingersoll.com/
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message