lucene-solr-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: While idexing millions of data Getting error
Date Fri, 18 Dec 2015 15:51:03 GMT
Deep in that stack trace: "Suppressed: java.io.IOException: No space left
on device". So the disk is full, apparently. That seems unlikely with the big
disks on most systems these days. Are you using an SSD? They can be relatively
small, especially on a box that has been virtualized into multiple VMs.
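
A quick way to confirm the out-of-disk theory is to check free space on the
volume that holds the index. A minimal sketch in Python, using only the
standard library; the path "." is a placeholder of mine, so point it at
wherever your Solr data directory actually lives:

```python
import shutil

def free_space_report(path="."):
    """Return (free_bytes, free_fraction) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    # Guard against a zero-sized filesystem so the division cannot fail.
    fraction = usage.free / usage.total if usage.total else 0.0
    return usage.free, fraction

# "." is a placeholder; substitute the directory that holds the index.
free_bytes, free_fraction = free_space_report(".")
print(f"{free_bytes / 1e9:.1f} GB free ({free_fraction:.0%} of the volume)")
```

Keep in mind that index merges need temporary room on top of the index
itself, so a volume that looks merely "mostly full" can still run out during
heavy indexing.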

Some discussion of the initial HTTP error here:
http://stackoverflow.com/questions/29527803/eliminating-or-understanding-jetty-9s-illegalstateexception-too-much-data-aft

But maybe Solr/Lucene are behaving in some extreme manner when out of disk
space, cascading to the actual error you got at the client.

How many documents do you send at a time? How often do you commit?
Generally, you should send batches of documents, like 1,000 at a time.
Maybe commit every 50,000 documents.
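
To make that concrete, here is a rough sketch of the batching loop in Python.
It is not tied to any particular Solr client: send_batch and commit are
hypothetical callables that you would wire to whatever client you use, and
the 1,000 / 50,000 figures are just the ballpark numbers above, not tuned
values.

```python
from itertools import islice

def batches(docs, size=1000):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def index_all(docs, send_batch, commit, batch_size=1000, commit_every=50000):
    """Send docs in batches and commit roughly every `commit_every` docs.

    send_batch and commit are hypothetical callables supplied by the caller.
    Returns the total number of documents sent.
    """
    sent = 0
    since_commit = 0
    for chunk in batches(docs, batch_size):
        send_batch(chunk)
        sent += len(chunk)
        since_commit += len(chunk)
        if since_commit >= commit_every:
            commit()
            since_commit = 0
    if since_commit:
        commit()  # final commit for the leftover documents
    return sent

# Stand-in callables that only record what would have been sent.
sent_sizes = []
commit_count = [0]

def record_batch(chunk):
    sent_sizes.append(len(chunk))

def record_commit():
    commit_count[0] += 1

total = index_all(({"id": str(i)} for i in range(120500)),
                  record_batch, record_commit)
```

Because the documents are pulled from an iterator, the whole set never has to
sit in memory at once, which matters when you are indexing millions of them.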

2 million documents is nothing for Solr. I recommend 100 million per
node/shard as a rough practical limit although the exact practical limit
depends on your particular hardware and your particular data model and the
data itself.

How large is each document, roughly? Hundreds, thousands, or millions of
bytes? Are some documents extremely large?


-- Jack Krupansky

On Fri, Dec 18, 2015 at 10:30 AM, Toke Eskildsen <te@statsbiblioteket.dk>
wrote:

> Mugeesh Husain <mugeesh@gmail.com> wrote:
> > could you tell me the maximum number of limit for posting data to solr.
>
> The data size can be at most 2GB, possibly minus a few bytes. It is due to
> the HttpUrlComponent used inside of Solr, which only accepts a signed
> integer as size.
>
> As for the number of documents, the limit is 2 billion. It does not seem
> to be a problem in your case.
>
>
> - Toke Eskildsen
>
