lucene-java-user mailing list archives

From Michael McCandless
Subject Re: Files greater than 20 MB not getting Indexed. No files generated except write.lock even after 8-9 minutes.
Date Thu, 29 Aug 2013 10:50:57 GMT
Lucene doesn't have document size limits.

There are default limits for how many tokens the highlighters will process ...

But, if you are passing each line as a separate document to Lucene,
then Lucene only sees a bunch of tiny documents, right?

Can you boil this down to a small test showing the problem?
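
One possible starting point for such a test, as a minimal sketch only: it uses nothing but the JDK, writes a synthetic input file a little over 20 MB, and streams it back one line at a time to a per-line callback. The class and method names (LineFeedTest, writeSyntheticFile, feedLines) are hypothetical, and the Lucene call itself is deliberately left out; the callback is where each line would be wrapped in a Document and passed to IndexWriter.addDocument.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

public class LineFeedTest {

    // Writes ASCII lines until at least minBytes have been written.
    // Returns the number of lines written.
    public static long writeSyntheticFile(Path file, long minBytes) throws IOException {
        long lines = 0;
        long bytes = 0;
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            while (bytes < minBytes) {
                String line = "line " + lines + " some sample log text";
                out.write(line);
                out.write('\n');
                bytes += line.length() + 1; // ASCII only, so one char is one byte
                lines++;
            }
        }
        return lines;
    }

    // Streams the file one line at a time, so memory use does not grow
    // with file size. Returns the number of lines handed to perLine.
    public static long feedLines(Path file, Consumer<String> perLine) throws IOException {
        long count = 0;
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                perLine.accept(line); // assumption: the indexing call would go here
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("big-input", ".txt");
        try {
            long written = writeSyntheticFile(file, 21L * 1024 * 1024);
            long start = System.nanoTime();
            long fed = feedLines(file, line -> { /* no-op placeholder */ });
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println("wrote " + written + " lines, fed " + fed
                    + " lines in " + ms + " ms");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the no-op version finishes quickly and the slowdown only appears once the callback calls IndexWriter.addDocument, that narrows the problem to the indexing side. It is also worth checking that the writer is committed or closed before the merge step, since that is when the index files become visible to other readers.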

Mike McCandless

On Thu, Aug 29, 2013 at 1:51 AM, Ankit Murarka wrote:
> Hello all,
> I am facing the following issue.
> I have many files that I am indexing.
> Problem:
> a. Files smaller than 20 MB are successfully indexed and merged.
> b. Files larger than 20 MB are not getting indexed. No exception is
> thrown; only a lock file is created in the index directory. The
> indexing process for a single file exceeding 20 MB continues for more
> than 8 minutes, after which my code merges the generated index into the
> existing index.
> Since no index has been generated at that point, I get an exception during
> the merge.
> Why are files larger than 20 MB not being indexed? I am
> indexing each line of the file. Why is IndexWriter not throwing any error?
> Do I need to change any parameter in Lucene or tweak the Lucene settings?
> Lucene version is 4.4.0.
> My current deployment for Lucene is on a server running with 128 MB and 512
> MB heap.
> --
> Regards
> Ankit Murarka
> "What lies behind us and what lies before us are tiny matters compared with
> what lies within us"