lucene-solr-user mailing list archives

From Erick Erickson <>
Subject Re: Indexing Files Month by Month
Date Thu, 12 Jun 2014 14:43:51 GMT
Partition your files into month-sized folders and have DIH work on one
directory at a time.
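
A rough sketch of that partitioning step, in case it helps. It assumes
the files sit in one flat directory and that "month" means the month of
each file's last-modified time (in UTC); if your month comes from the
file name or from metadata, swap out monthOf. The class and method names
are made up for illustration and are not part of Solr or DIH.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: moves files into one sub-folder per month (e.g. 2014-06).
class MonthPartitioner {

    // Assumption: the month bucket is taken from the last-modified time, in UTC.
    static YearMonth monthOf(Path file) throws IOException {
        return YearMonth.from(
                Files.getLastModifiedTime(file).toInstant().atZone(ZoneOffset.UTC));
    }

    // Moves every regular file directly under sourceDir into targetRoot/<yyyy-MM>/.
    // Returns the number of files moved. Sub-directories of sourceDir are ignored.
    static int partition(Path sourceDir, Path targetRoot) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(sourceDir)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int moved = 0;
        for (Path file : files) {
            Path monthDir = targetRoot.resolve(monthOf(file).toString());
            Files.createDirectories(monthDir);
            // Fails if a file with the same name is already in the month folder.
            Files.move(file, monthDir.resolve(file.getFileName()));
            moved++;
        }
        return moved;
    }
}
```

After that you'd point DIH at one month folder per run (for instance via
the baseDir of a FileListEntityProcessor entity, if that's what your
config uses).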

What I'd do is move away from DIH and use SolrJ. That way
1> you can take full control over what you do
2> you can offload the heavy lifting of parsing the various files
    (I'm assuming here that you're indexing PDFs, Word docs, etc)
    to a bunch of clients.
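
To give a feel for point 2>, here's a minimal sketch of fanning one
month's files out to several workers. It uses threads inside a single
client as a stand-in for "a bunch of clients"; separate processes or
machines would follow the same shape. The parseAndSend callback is a
placeholder for whatever you do per file (parse it, build a document,
send it to Solr with SolrJ); nothing here is actual SolrJ API, and the
class name is made up.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper: runs a per-file callback on a fixed pool of worker threads.
class ParallelIndexer {

    // Hands every file to parseAndSend on one of `threads` workers and returns
    // how many files completed without throwing. A failure in one file does not
    // stop the others; it is simply not counted.
    static int indexAll(List<Path> files, int threads, Consumer<Path> parseAndSend)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger done = new AtomicInteger();
        for (Path file : files) {
            pool.submit(() -> {
                parseAndSend.accept(file);   // placeholder for parse + send to Solr
                done.incrementAndGet();
            });
        }
        pool.shutdown();                             // no new tasks; queued ones still run
        pool.awaitTermination(1, TimeUnit.HOURS);    // assumed upper bound for one batch
        return done.get();
    }
}
```

Comparing the returned count with files.size() tells you whether
anything in the batch needs a retry.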

Here's some code samples:

Or, if you really want to get wild, consider the MapReduceIndexerTool. That
requires some infrastructure though.


On Thu, Jun 12, 2014 at 7:22 AM, Venkata krishna <> wrote:
> Hi,
> I am using Lucene/Solr and would like to use the Data Import Handler to index
> files, but there are millions of files to import, so the indexing process will
> take a long time. I have decided to import the files month by month, so could
> you please suggest a way to import files on a month-by-month basis?
> Thanks,
> Venkata Krishna Tolusuri.
