lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: reindex all files
Date Thu, 02 Nov 2006 13:14:50 GMT
There isn't enough of an explanation here to give you any meaningful answer
about performance. Do you have any evidence that the optimization will take
a long time? Have you run it before? Do you have any count of how many files
you're indexing? Do they have a complex structure? What are the performance
numbers you need? What performance are you seeing? The only real advice here
is "try some stuff and see". That said, here are some parameters that will
affect your indexing speed, see the javadocs for cautions about how to use
these.

setMergeFactor
setMaxBufferedDocs
setMaxMergeDocs

I'd advise you just try a few runs with the defaults and see whether
performance needs tweaking....

Here's some code for recursing a directory structure


    private void index(File file) throws Exception {
        // Boring stuff to only try to index files.
        if (file.canRead()) {
            if (file.isDirectory()) {
                String[] files = file.list();
                if (files != null) {
                    for (int i = 0; i < files.length; i++) {
                        index(new File(file, files[i]));
                    }
                }
            } else {
                // Index the file
                System.out.println("Indexing " + file);
                // Here's the interesting part.
                indexOneFile(file); // Here's where you'll do your
processing....
            }
        }
    }


Call this with a new File("path to directory");
On 11/2/06, spinergywmy <spinergywmy@gmail.com> wrote:
>
>
> Hi,
>
>    Basically what I want to do is to reindex all the files in particular
> directory. I wonder how I can do this with Lucene search? And I noticed
> that
> reindexing all the files will actually take a lot of time and resources,
> so
> how I can optimize the preformance problem.
>
>    Hope there are examples that I can review on.
>
>    Thanks.
>
> regards,
> Wooi Meng
> --
> View this message in context:
> http://www.nabble.com/reindex-all-files-tf2558188.html#a7128924
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message