lucene-java-user mailing list archives

From "Tomi NA"
Subject Re: combined filesystem and web search
Date Tue, 11 Jul 2006 22:51:30 GMT
On 7/11/06, Erick Erickson wrote:
> I can answer a few of these. If you haven't yet, you'd do yourself a favor
> to pick up the book "Lucene in Action". It's written to the 1.4 code-base,
> the examples compile but give deprecated warnings for the 1.9 code base, and
> need a few more tweaks for the 2.0 code base.

I wish people would start selling .pdf books online... :(

> Also, download a copy of Luke. It's invaluable for looking at an index to
> answer questions like "why isn't this query returning what I thought it
> should?".

Thanks for the tip, I'll be sure to try it out. Sounds like a good tool to have.

> > Is there a good explanation somewhere of how to set up incremental
> > indexing, rather than e.g. rebuilding the whole index nightly?
> See Lucene in Action (LIA). The short answer is yes. In the simplest form,
> you can just add new data to a currently-existing index. You probably want

This will probably sound like a ridiculous question but...*how* would
I add new data to an existing index? Did you mean literally "add" as
in index an additional directory or website, rather than "refresh the
index of the currently indexed documents, now that they've changed"?
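
To make the distinction in that question concrete, here is a minimal sketch of the bookkeeping behind a "refresh" style incremental run, as opposed to only appending new documents. It assumes the indexer keeps a record of each document's last-modified time from the previous run. The class IncrementalPlan and its fields are hypothetical names for illustration, not part of Lucene, and the actual calls that add to or delete from the index are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: compares what was indexed last
// time with what exists now and sorts documents into three work lists.
class IncrementalPlan {
    final List<String> toAdd = new ArrayList<>();     // never indexed before
    final List<String> toUpdate = new ArrayList<>();  // indexed, but changed since
    final List<String> toDelete = new ArrayList<>();  // indexed, but no longer present

    // indexed: path -> last-modified time recorded at the previous run
    // current: path -> last-modified time observed now
    // Assumption: neither map contains null values.
    static IncrementalPlan plan(Map<String, Long> indexed, Map<String, Long> current) {
        IncrementalPlan p = new IncrementalPlan();
        // TreeMap copies give a stable, sorted iteration order.
        for (Map.Entry<String, Long> e : new TreeMap<>(current).entrySet()) {
            Long seen = indexed.get(e.getKey());
            if (seen == null) {
                p.toAdd.add(e.getKey());
            } else if (e.getValue() > seen) {
                // A changed document means: remove the old copy, index the new one.
                p.toUpdate.add(e.getKey());
            }
        }
        for (String path : new TreeMap<>(indexed).keySet()) {
            if (!current.containsKey(path)) {
                p.toDelete.add(path);
            }
        }
        return p;
    }
}
```

With a plan like this, "adding" in the literal sense only ever touches the toAdd list, while a refresh also has to process toUpdate and toDelete.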

> to make a copy just in case your power goes out or something.

I'd imagine a search index in an intranet is probably the easiest
thing to rebuild if it comes to data loss due to a blackout: I'm not
worried about that at the moment. If the indexing becomes a multiday
effort (which I highly doubt it will), then I'll start to worry.
Still, I'd like to have the option of using multiple indices.

> Then there's IndexMergeTool which I haven't used, but looks interesting.

I haven't run into it. Can you point me to a document or two?

> Then there's the possibility of using a MultiSearcher, but I'll leave it to
> the experts to talk about how that combines results....

From what I gathered from the mailing list archives, I guessed that
MultiSearcher would come up. Where should I start?
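
As a rough illustration of what "combining results" from several indices involves, the sketch below merges per-index hit lists into one ranking by score. It is a naive stand-in and makes no claim about how Lucene's MultiSearcher does it. ScoredHit and HitMerger are hypothetical names, and the sketch assumes the scores from the different indices are directly comparable, which need not hold when each index has its own term statistics.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical value type: a document identifier plus its relevance score.
class ScoredHit {
    final String doc;
    final float score;

    ScoredHit(String doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

// Hypothetical helper, not part of Lucene.
class HitMerger {
    // Pools the hits from every index and returns the topN highest scores.
    // Assumption: scores from different indices are on the same scale.
    static List<ScoredHit> merge(List<List<ScoredHit>> perIndex, int topN) {
        List<ScoredHit> all = new ArrayList<>();
        for (List<ScoredHit> hits : perIndex) {
            all.addAll(hits);
        }
        // Highest score first; the sort is stable, so ties keep their input order.
        all.sort(Comparator.comparingDouble((ScoredHit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
    }
}
```

For a combined filesystem and web search, the filesystem index and the web index would each contribute one list to perIndex.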

No comments on ranking of documents with no real interlinking to help?

