lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Field.Store.Compress - does it improve performance of document reads?
Date Thu, 17 May 2007 17:23:07 GMT
Hmm. Now that I've re-read your first mail, something else
suggests itself. You stated:

"We have amongst other fields one field (default) storing all searchable
fields".

Do you need to store this field at all? You can search fields that are
indexed but NOT stored. I've used a similar technique: I index lots of
different fields into a single search field so my queries aren't as
complex, and return various other stored fields to the user for display
purposes. Often these display fields are stored but NOT indexed.

It might also be useful if you'd post some of your relevant code
snippets. Perhaps some innocent line is messing you up... Are you,
perhaps, calling get() in a HitCollector? Or iterating through
many documents with a Hits object? Or...

Best
Erick
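
On the cost of loading many documents: Paul's suggestion further down the
thread (read stored documents in document number order) could be sketched
roughly as below. This is only an illustration, not code from anyone in the
thread. DocLoader is a hypothetical stand-in for whatever actually fetches a
stored document (for example a call on an IndexReader), and the class and
method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DocOrderRead {
    /** Hypothetical stand-in for the real stored-document fetch. */
    interface DocLoader {
        String load(int docId);
    }

    /**
     * Loads one value per hit, touching the index in ascending document
     * number order, and returns the values in the original rank order.
     */
    static String[] loadInDocOrder(int[] rankedDocIds, DocLoader loader) {
        int n = rankedDocIds.length;

        // positions[i] is an index into rankedDocIds; sort those indexes
        // by the document number they point at.
        Integer[] positions = new Integer[n];
        for (int i = 0; i < n; i++) {
            positions[i] = i;
        }
        Arrays.sort(positions,
                (a, b) -> Integer.compare(rankedDocIds[a], rankedDocIds[b]));

        // Fetch in document number order, but file each result under its
        // original rank so the caller still sees hits best-first.
        String[] results = new String[n];
        for (int pos : positions) {
            results[pos] = loader.load(rankedDocIds[pos]);
        }
        return results;
    }

    public static void main(String[] args) {
        int[] hits = {42, 7, 19};               // doc ids in rank order
        List<Integer> accessOrder = new ArrayList<>();
        String[] contents = loadInDocOrder(hits, id -> {
            accessOrder.add(id);                // record the order of reads
            return "content of doc " + id;
        });
        System.out.println("read order:   " + accessOrder);
        System.out.println("rank order:   " + Arrays.toString(contents));
    }
}
```

Whether this helps in practice depends on how the hits are spread across the
index and on the disk cache, so it is worth measuring before and after.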

On 5/17/07, Andreas Guther <andreas.guther@gmail.com> wrote:
>
> I am actually using the FieldSelector and, unless I did something wrong, it
> did not give me any load performance improvement, which was both surprising
> and disappointing.  The only difference I could see was when I returned
> NO_LOAD for all fields, which from my understanding is the same as skipping
> over the document.
>
> Right now I am looking into fragmentation problems of my huge index files.
> I am de-fragmenting the hard drive to see if this brings any read
> performance improvements.
>
> I am also wondering if the FieldCache as discussed in
> http://www.gossamer-threads.com/lists/lucene/general/28252 would help
> improve the situation.
>
> Andreas
>
> On 5/17/07, Grant Ingersoll <gsingers@apache.org> wrote:
> >
> > I haven't tried compression either.  I know there was some talk a
> > while ago about deprecating it, but that hasn't happened.  The current
> > implementation yields the highest level of compression.  You might
> > find better results by compressing in your application and storing as
> > a binary field, thus giving you more control over CPU used.  This is
> > our current recommendation for dealing w/ compression.
> >
> > If you are not actually displaying that field, you should look into
> > the FieldSelector API (via IndexReader).  It allows you to lazily
> > load fields or skip them altogether and can yield a pretty
> > significant savings when it comes to loading documents.
> > FieldSelector is available in 2.1.
> >
> > -Grant
> >
> > On May 17, 2007, at 4:01 AM, Paul Elschot wrote:
> >
> > > On Thursday 17 May 2007 08:10, Andreas Guther wrote:
> > >> I am currently exploring how to solve performance problems I encounter
> > >> with Lucene document reads.
> > >>
> > >> We have amongst other fields one field (default) storing all searchable
> > >> fields.  This field can become quite large since we are indexing
> > >> documents and storing the content for display within results.
> > >>
> > >> I noticed that the read can be very expensive.  I wonder now if it would
> > >> make sense to add this field as Field.Store.COMPRESS to the index.  Can
> > >> someone tell me if this would speed up the document read or if this is
> > >> something only interesting for saving space?
> > >
> > > I have not tried the compression yet, but in my experience a good way
> > > to reduce the costs of document reads from a disk is by reading them
> > > in document number order whenever possible. In this way one saves
> > > on the disk head seeks.
> > > > Compression should actually help reduce the costs of disk head seeks
> > > > even more.
> > >
> > > Regards,
> > > Paul Elschot
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> >
> > --------------------------
> > Grant Ingersoll
> > Center for Natural Language Processing
> > http://www.cnlp.org/tech/lucene.asp
> >
> > Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ
> >
>
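
Grant's suggestion quoted above (compress in the application and store the
result as a binary field, to get control over the CPU spent) could look
something like the sketch below. It uses only java.util.zip from the JDK.
The class and method names are invented for the example, and the step of
actually putting the byte[] into a Lucene binary field is left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FieldCompression {
    /**
     * Compresses text at the given Deflater level, e.g.
     * Deflater.BEST_SPEED (less CPU) or Deflater.BEST_COMPRESSION (smaller).
     */
    static byte[] compress(String text, int level) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();   // release native zlib resources
        }
    }

    /** Reverses compress(); rejects corrupt or truncated input. */
    static String decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated compressed data");
                }
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("some stored document content ");
        }
        String text = sb.toString();
        byte[] fast = compress(text, Deflater.BEST_SPEED);
        byte[] small = compress(text, Deflater.BEST_COMPRESSION);
        System.out.println("original chars:   " + text.length());
        System.out.println("BEST_SPEED bytes: " + fast.length);
        System.out.println("BEST_COMPRESSION: " + small.length);
        System.out.println("round trip ok:    " + decompress(fast).equals(text));
    }
}
```

The level is the knob Grant refers to: a lower level trades a larger stored
field for less CPU at index and read time, and the right setting depends on
the data, so it should be measured rather than assumed.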
