lucene-solr-user mailing list archives

From "Rajendran, Prabaharan" <Rajendra...@DNB.com>
Subject RE: SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many bytes written
Date Tue, 28 Jun 2016 16:32:47 GMT
Thanks Erick, for your response. Now I am splitting the file before indexing.
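
For anyone finding this thread later, here is a rough sketch of what splitting the file before indexing could look like. It is not the script actually used. The function name, the chunk file naming and the default chunk size are made up for illustration. It assumes the first line of the file is the header row (which Solr's CSV loader reads as field names by default) and that no field contains an embedded line break, so cutting on line boundaries is safe.

```python
import os


def split_with_header(src_path, out_dir, lines_per_chunk=500_000):
    """Split a delimited text file into chunk files on line boundaries.

    The header line is repeated at the top of every chunk so that each
    chunk can be posted to Solr on its own. Assumes no field contains
    an embedded newline.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-8", newline="") as src:
        header = src.readline()
        for i, line in enumerate(src):
            if i % lines_per_chunk == 0:
                # Start a new chunk file and write the header first.
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"chunk_{len(chunk_paths):05d}.txt")
                out = open(path, "w", encoding="utf-8", newline="")
                out.write(header)
                chunk_paths.append(path)
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths
```

Each resulting chunk file can then be passed to post.jar with the same -Dtype, -Dparams and -Durl options as in the original command.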

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: 28 June 2016 11:01
To: solr-user
Subject: Re: SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many bytes written

You're most likely not getting anywhere _near_ 4.2G written to Solr; the transport protocol is
probably cutting the request off, as indicated by the "Early EOF" exception.

It's really hard to justify trying to index 4.2G as a _single_ file.
First, you won't even be able to receive it in Solr when you've given it only 1G of
memory, even if you get the transport stuff worked out. Second, searching it is totally useless
in most cases, as it will probably match _everything_.
Third, even if it does match something, how are you going to return it to a user?

If it's really multiple documents packed into one huge uber-doc, you can break it up at ingestion
and send only the individual docs to Solr rather than the whole thing.
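
As a minimal sketch of breaking the input up at ingestion without writing intermediate files, the following reads the file in batches of rows and posts each batch as its own small CSV request. The URL and the separator=%09 parameter are taken from the post.jar command quoted below. The function names and the batch size are hypothetical, and it assumes the first line is the header and that no field spans multiple lines. No commit is issued here, so one would still be needed (or autoCommit configured) before the documents become searchable.

```python
import itertools
import urllib.request

# Update URL and tab separator as used in the original post.jar command.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update?separator=%09"


def iter_batches(lines, batch_size):
    """Yield CSV bodies of at most batch_size rows, each starting with the header."""
    it = iter(lines)
    header = next(it, None)
    if header is None:
        return  # empty input: nothing to send
    while True:
        rows = list(itertools.islice(it, batch_size))
        if not rows:
            return
        yield header + "".join(rows)


def post_csv(body, url=SOLR_UPDATE_URL):
    """POST one CSV body to Solr and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/csv; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_in_batches(path, batch_size=10_000, send=post_csv):
    """Stream the file at path to Solr in batches; return the number of requests sent."""
    sent = 0
    # newline="" keeps the original line endings untouched.
    with open(path, encoding="utf-8", newline="") as f:
        for body in iter_batches(f, batch_size):
            send(body)
            sent += 1
    return sent
```

Only one batch is held in memory at a time on the client side, and each request stays small enough that the transport layer and a 1G Solr heap are not stressed by a single oversized upload.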

IOW, I think this is a waste of your time. I understand that you're trying to see the limits,
but this limit is not a reasonable one to hope to cross.

Best,
Erick

On Mon, Jun 27, 2016 at 6:24 AM, Rajendran, Prabaharan <RajendranPr@dnb.com> wrote:
> Hi,
>
> I am trying to index a text file about 4.2 GB in size. This is a kind of POC to understand
> Solr's capacity for indexing and searching.
>
> Here is my Solr configuration
> -Xms1024m        -Xmx1024m        -Xss256k
>
> java -Dtype=text/csv -Dparams="separator=%09" 
> -Durl=http://localhost:8983/solr/mycollection/update -jar 
> ..\example\exampledocs\post.jar ..\example\exampledocs\largefile.txt
>
> While indexing, I got the error below:
> SimplePostTool: FATAL: IOException while posting data: 
> java.io.IOException: too many bytes written
>
> Kindly let me know if I need to change any Solr configuration (e.g. increase memory) to
> handle this.
>
> Here is my log file entry,
>
> ERROR (qtp297811323-14) [   x:collection2] o.a.s.c.SolrCore org.apache.solr.common.SolrException: CSVLoader: input=null, line=2815040,can't read line: 2815040
>                 values={NO LINES AVAILABLE}
>                 at org.apache.solr.handler.loader.CSVLoaderBase.input_err(CSVLoaderBase.java:317)
>                 at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:356)
>                 at org.apache.solr.handler.loader.CSVLoader.load(CSVLoader.java:31)
>                 at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>                 at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>                 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>                 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>                 at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:669)
>                 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:462)
>                 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:214)
>                 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
>                 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>                 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>                 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>                 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>                 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>                 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>                 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>                 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>                 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>                 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>                 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>                 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>                 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>                 at org.eclipse.jetty.server.Server.handle(Server.java:499)
>                 at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>                 at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>                 at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>                 at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>                 at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>                 at java.lang.Thread.run(Thread.java:745)
> Caused by: org.eclipse.jetty.io.EofException: Early EOF
>                 at org.eclipse.jetty.server.HttpInput$3.noContent(HttpInput.java:506)
>                 at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:124)
>                 at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
>                 at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
>                 at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
>                 at java.io.InputStreamReader.read(InputStreamReader.java:184)
>                 at java.io.BufferedReader.fill(BufferedReader.java:154)
>                 at java.io.BufferedReader.read(BufferedReader.java:175)
>                 at org.apache.solr.internal.csv.ExtendedBufferedReader.read(ExtendedBufferedReader.java:82)
>                 at org.apache.solr.internal.csv.CSVParser.simpleTokenLexer(CSVParser.java:421)
>                 at org.apache.solr.internal.csv.CSVParser.nextToken(CSVParser.java:371)
>                 at org.apache.solr.internal.csv.CSVParser.getLine(CSVParser.java:231)
>                 at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:353)
>                 ... 29 more
>
> Thanks,
> Prabaharan