lucene-java-user mailing list archives

From "Greg Shackles" <gshack...@gmail.com>
Subject Re: Lucene implementation/performance question
Date Fri, 28 Nov 2008 01:02:27 GMT
The queries I'm doing really aren't anything clever...just searching for
phrases on pages of text, sometimes narrowing results by other words that
must appear on the page, or words that cannot appear on the same page.  I
don't have experience with those span queries so I can't say much about
them.  However, I will say that at present there seems to be no way to make
the PayloadSpanUtil act on just a subset of an index.  I tried just taking
matching documents and putting them into a RAM-backed index, but that
doesn't transfer over the payloads so it was pretty much useless for me.  I
hope this is something they can work out in the future.  In the payloads I
store a lot of metadata, including the word as it actually appeared on the
page, with capitalization, punctuation, etc.
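
To illustrate the kind of metadata I mean, here is a minimal sketch of packing
a token's original surface form into a payload-style byte array.  This is
plain Java and does not touch Lucene; the class name, the one-byte flags
field, and the layout are assumptions made up for the example, not the format
I actually use.

```java
import java.nio.charset.StandardCharsets;

public class PayloadSketch {
    // Hypothetical layout: byte 0 holds flags, the rest is the word
    // exactly as it appeared on the page, encoded as UTF-8.
    static byte[] encode(String surfaceForm, byte flags) {
        byte[] text = surfaceForm.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[text.length + 1];
        out[0] = flags;
        System.arraycopy(text, 0, out, 1, text.length);
        return out;
    }

    // Recover the original word (capitalization and punctuation intact).
    static String decodeSurfaceForm(byte[] payload) {
        return new String(payload, 1, payload.length - 1, StandardCharsets.UTF_8);
    }

    // Recover the flags byte.
    static byte decodeFlags(byte[] payload) {
        return payload[0];
    }

    public static void main(String[] args) {
        byte[] payload = encode("Hello,", (byte) 1);
        System.out.println(decodeSurfaceForm(payload));
    }
}
```

The indexed term would be the normalized word, while a byte array like this
would ride along as the payload and be decoded after a match is found.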

I don't think it's really feasible to search on the payload since that isn't
indexed.  If you have things in there that you would want indexed, I would
suggest designing your indexes differently to accommodate that.

I'm using Lucene 2.4, plus the patch that Mark put out to fix the payload
issues I ran into.  I wouldn't suggest using anything older, since payload support wasn't
very useful before the patch.  It would probably be worth your time just to
upgrade the version of Lucene you are using anyway, for a variety of
reasons.

- Greg
