lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Files greater than 20 MB not getting Indexed. No files generated except write.lock even after 8-9 minutes.
Date Thu, 29 Aug 2013 10:50:57 GMT
Lucene doesn't have document size limits.

There are default limits for how many tokens the highlighters will process ...

But, if you are passing each line as a separate document to Lucene,
then Lucene only sees a bunch of tiny documents, right?

Can you boil this down to a small test showing the problem?
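
One way to start such a test is to separate reading the file from indexing it. The sketch below is only an illustration under assumptions: the class name LineFeeder and its feed method are hypothetical and are not part of Lucene. It streams the input one line at a time, so a 20 MB file is never held in memory as a whole, and passes each non-empty line to a caller-supplied sink.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Hypothetical helper: streams a reader line by line into a sink. */
class LineFeeder {

    /**
     * Reads one line at a time so the whole file is never buffered in memory.
     * Returns the number of lines handed to the sink.
     */
    static long feed(BufferedReader reader, Consumer<String> sink) {
        long count = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // assumption: blank lines are not worth a document
                }
                sink.accept(line); // a real test would build a document from the line here
                count++;
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

Running it first with a sink that only counts lines shows how long plain reading takes. Swapping in a sink that builds a Document and calls IndexWriter.addDocument for each line, then comparing the two timings, would show whether the 8 minutes are spent in Lucene or elsewhere.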

Mike McCandless

http://blog.mikemccandless.com


On Thu, Aug 29, 2013 at 1:51 AM, Ankit Murarka
<ankit.murarka@rancoretech.com> wrote:
> Hello all,
>
> Faced with a typical issue.
> I have many files which I am indexing.
>
> Problem faced:
> a. Files smaller than 20 MB are successfully indexed and merged.
>
> b. Files larger than 20 MB are not getting indexed. No exception is
> thrown. Only a lock file is created in the index directory. The
> indexing process for a single file exceeding 20 MB runs for more
> than 8 minutes, after which my code merges the generated index into
> the existing index.
>
> Since no index is being generated now, I get an exception during merging
> process.
>
> Why are files larger than 20 MB not being indexed? I am
> indexing each line of the file. Why is IndexWriter not throwing any error?
>
> Do I need to change any parameter in Lucene or tweak the Lucene settings?
> Lucene version is 4.4.0
>
> My current deployment for Lucene is on a server running with 128 MB and 512
> MB heap.
>
> --
> Regards
>
> Ankit Murarka
>
> "What lies behind us and what lies before us are tiny matters compared with
> what lies within us"
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
