lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andreas Guther" <andreas.gut...@gmail.com>
Subject Re: Field.Store.Compress - does it improve performance of document reads?
Date Fri, 18 May 2007 04:35:53 GMT
I found a similar recommendation about the disc access and reading in order
in the following message and implemented this in my code:
http://www.gossamer-threads.com/lists/lucene/general/28268#28268

Since I am dealing with multiple index directories I sorted the document
references by index number and then by doc id.  This actually improved the
read access and reduced the read time to 50% and less.

I think this is a very interesting performance improvement tip that should
have its place in the FAQ.

Thanks for the input.

Andreas


On 5/17/07, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
>
> ----- Original Message ----
> From: Paul Elschot <paul.elschot@xs4all.nl>
>
> On Thursday 17 May 2007 08:10, Andreas Guther wrote:
> > I am currently exploring how to solve performance problems I encounter
> with
> > Lucene document reads.
> >
> > We have amongst other fields one field (default) storing all searchable
> > fields.  This field can become of considerable size since we
> are  indexing
> > documents and  store the content for display within results.
> >
> > I noticed that the read can be very expensive.  I wonder now if it would
> > make sense to add this field as Field.Store.Compress to the index.  Can
> > someone tell me if this would speed up the document read or if this is
> > something only interesting for saving space.
>
> I have not tried the compression yet, but in my experience a good way
> to reduce the costs of document reads from a disk is by reading them
> in document number order whenever possible. In this way one saves
> on the disk head seeks.
> Compression should actually help reducing the costs of disk head seeks
> even more.
>
> OG: Does this really help in a multi-user environment where there are
> multiple parallel queries hitting the index and reading data from all over
> the index and the disk?  They will all share the same disk head, so the head
> will still have to jump around to service all these requests, even if each
> request is being careful to read documents in docId order, no?
>
> Otis
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message