lucene-dev mailing list archives

From <karl.wri...@nokia.com>
Subject RE: Solr spewage and dropped documents, while indexing
Date Wed, 09 Jun 2010 13:01:00 GMT
FWIW, I've chased this upstream in Solr to the following code, in SolrRequestParsers.java,
which creates disk-file-based streams from the content part of individual multipart form requests:

>>>>>>
    // Create a factory for disk-based file items
    DiskFileItemFactory factory = new DiskFileItemFactory();

    // Set factory constraints
    // TODO - configure factory.setSizeThreshold(yourMaxMemorySize);
    // TODO - configure factory.setRepository(yourTempDirectory);

    // Create a new file upload handler
    ServletFileUpload upload = new ServletFileUpload(factory);
    upload.setSizeMax( uploadLimitKB*1024 );

    // Parse the request
    List items = upload.parseRequest(req);
    Iterator iter = items.iterator();
    while (iter.hasNext()) {
        FileItem item = (FileItem) iter.next();

        // If it's a form field, put it in our parameter map
        if (item.isFormField()) {
          MultiMapSolrParams.addParam( 
            item.getFieldName(), 
            item.getString(), params.getMap() );
        }
        // Only add it if it actually has something...
        else if( item.getSize() > 0 ) { 
          streams.add( new FileItemContentStream( item ) );
        }
    }
<<<<<<

It seems to me there's an opportunity here for Solr to proliferate temporary disk files
until a directory limit is reached, if cleanup after processing is inadequate.  I have not
yet managed to research the cleanup end or run experiments to verify this scenario, but if
I correctly recall how DiskFileItemFactory works, it creates files and registers them to be
cleaned up on JVM exit.  If that's the only cleanup, it's not going to cut it for a
long-running, real-world system.
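If the JVM-exit cleanup recollection is right, the failure mode is easy to demonstrate with plain JDK calls. This is a standalone sketch of the general mechanism (class name and scenario are mine, not the actual commons-fileupload internals):

```java
import java.io.File;
import java.io.IOException;

public class TempFileLeakSketch {
    public static void main(String[] args) throws IOException {
        // Simulate what a disk-backed upload item does: create a temp file.
        File tmp = File.createTempFile("upload", ".tmp");

        // deleteOnExit() only schedules deletion for JVM termination; in a
        // long-running servlet container the file lingers indefinitely.
        tmp.deleteOnExit();
        System.out.println("after deleteOnExit, exists = " + tmp.exists());

        // Explicit per-request cleanup removes the file as soon as the
        // request has been processed, instead of waiting for JVM exit.
        boolean deleted = tmp.delete();
        System.out.println("explicit delete = " + deleted + ", exists = " + tmp.exists());
    }
}
```

For what it's worth, the FileItem interface in commons-fileupload does expose a delete() method for exactly this per-request cleanup; calling it (e.g. in a finally block) once the streams have been consumed would avoid relying on JVM-exit deletion.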

Karl


-----Original Message-----
From: Wright Karl (Nokia-S/Cambridge) 
Sent: Monday, June 07, 2010 6:21 PM
To: dev@lucene.apache.org
Subject: Re: Solr spewage and dropped documents, while indexing

Thanks for the confirmation.  I am not sure I buy the 'just can't keep up' explanation, though.
The load on this box during the test looks reasonable and there is no sign of thrashing, etc.
So I am inclined to believe either there's a server timeout that is pathologically short, or
there's a race condition of some kind.

Karl
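For the 'pathologically short timeout' theory, one server-side knob to check is the idle timeout on Jetty's connector (the stack trace shows org.mortbay.jetty.bio.SocketConnector, so this is the Jetty 6 example setup). In the example jetty.xml it is set roughly like this; the exact value and element placement may differ between builds:

```xml
<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.bio.SocketConnector">
      <Set name="port">8983</Set>
      <!-- milliseconds Jetty waits on an idle connection before closing it;
           a low value here could abort slow multipart uploads mid-request -->
      <Set name="maxIdleTime">50000</Set>
    </New>
  </Arg>
</Call>
```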

--- original message ---
From: "ext Thijs Vonk" <vonk.thijs@gmail.com>
Subject: Re: Solr spewage and dropped documents, while indexing
Date: June 7, 2010
Time: 5:46:36  PM


Hi Karl,

I'm just a novice Solr user, but I saw this error some time ago. It
usually happened when I was pushing my machine too hard (the machine that
was gathering the documents to be sent to Solr). I was using too many
threads on that machine, and now that I've reduced them to a more
reasonable number I haven't seen it. It was as if the machine just
didn't have enough CPU power to actually send everything over the wire.

I read that you had about 50 threads running. Depending on what they are
doing, that seems like a lot, but it all depends on the hardware
you're running on, I guess.

Thijs

On 6/7/10 5:30 PM, karl.wright@nokia.com wrote:
> Hi Simon,
>
> The same data was successfully indexed with one thread over the weekend.  So it is unlikely to be a data issue.
>
> It turns out there was one of the content-missing exceptions far back in the screen buffer, but only one:
>
> Jun 7, 2010 9:57:48 AM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: missing content stream
>          at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:49)
>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>          at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321)
>          at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
>          at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>          at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>          at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>          at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>          at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>          at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>          at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>          at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>          at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>          at org.mortbay.jetty.Server.handle(Server.java:285)
>          at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>          at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>          at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
>          at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>          at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>          at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>          at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
>
> So Solr's behavior is consistent with it being genuinely convinced the content is missing.
>
> I am actually wondering if the problem is occurring because, with so much happening at the
> same time, we're seeing a socket timeout for some of the requests.  If the socket times out
> and closes, then I could well imagine that Solr would toss this error.  Thoughts?
>
> Karl
>
>
> -----Original Message-----
> From: ext Simon Willnauer [mailto:simon.willnauer@googlemail.com]
> Sent: Monday, June 07, 2010 11:20 AM
> To: dev@lucene.apache.org
> Subject: Re: Solr spewage and dropped documents, while indexing
>
> Karl, the HTTP error lines are produced by your code, right? Can you
> provide what was returned by Solr?
> If this were related to a server-side resource problem like the ones
> described above (running out of sockets, etc.), you would not see a
> 400! I could also imagine that the documents you are sending are
> empty - is that something that could have happened?
>
> simon
>
> On Mon, Jun 7, 2010 at 5:05 PM,<karl.wright@nokia.com>  wrote:
>
>> Perhaps - although missing_content_stream seems to imply that it had at least partly read 4 requests which later failed.  Also, wouldn't there be something in the output log which would give us a clue as to what happened?
>>
>> Is there any post-hiccup spelunking I can reasonably do?  Or should I try to reproduce the problem with more diagnostics on?
>>
>> Karl
>>
>>
>> -----Original Message-----
>> From: ext Bernd Fondermann [mailto:bf_jak@brainlounge.de]
>> Sent: Monday, June 07, 2010 10:54 AM
>> To: dev@lucene.apache.org
>> Subject: Re: Solr spewage and dropped documents, while indexing
>>
>> Looks like a server-side problem to me.
>> Maybe the server ran out of sockets or other resources and just replied
>> with a 400 error?
>>
>>   Bernd
>>
>> karl.wright@nokia.com wrote:
>>
>>> Hi folks,
>>>
>>> This morning I was experimenting with using multiple threads while indexing some 20,000,000 records worth of content.  In fact, my test spun up some 50 threads, and happily chugged away for a couple of hours before I saw the following output from my test code:
>>>
>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index record 6469124
>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index record 6469551
>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index record 6470592
>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index record 6472454
>>> java.net.SocketException: Connection reset
>>>          at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>>          at HttpPoster.getResponse(HttpPoster.java:280)
>>>          at HttpPoster.indexPost(HttpPoster.java:191)
>>>          at ParseAndLoad$PostThread.run(ParseAndLoad.java:638)
>>> <<<<<<
>>>
>>> Looking at the solr-side output, I see nothing interesting at all:
>>>
>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>>> INFO: [] webapp=/solr path=/update/extract params={literal.nokia_longitude=9.78518981933594&literal.nokia_phone=%2B497971910474&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Münster&literal.nokia_placerating=0&literal.id=6472724&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=1&literal.nokia_ppid=276u0wyw-c8cb7f4d6cd84a639a4e7d3570bf8814&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9985514322917&literal.nokia_postalcode=74405&literal.nokia_street=Weinhaldenstraße&literal.nokia_title=Dorfgemeinschaft+Münster+e.V.&literal.nokia_category=261} status=0 QTime=1
>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>>> INFO: [] webapp=/solr path=/update/extract params={literal.nokia_longitude=9.76717020670573&literal.nokia_phone=%2B497971950725&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472737&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=13&literal.nokia_ppid=276u0wyw-d3bed6449fcb41b0adc50ae08e041f8d&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9974405924479&literal.nokia_fax=%2B497971950712&literal.nokia_postalcode=74405&literal.nokia_street=Kochstraße&literal.nokia_title=BayWa+AG+Bau-+%26+Gartenmarkt&literal.nokia_category=194} status=0 QTime=0
>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>>> INFO: [] webapp=/solr path=/update/extract params={literal.nokia_longitude=9.77591044108073&literal.nokia_phone=%2B49797124009&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Unterrot&literal.nokia_placerating=0&literal.id=6472739&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=28&literal.nokia_ppid=276u0wyw-d534d7a9235a4edf878d5e32a34bad8b&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9791788736979&literal.nokia_fax=%2B49797123431&literal.nokia_postalcode=74405&literal.nokia_street=Hauptstraße&literal.nokia_title=Gastel+R.&literal.nokia_category=5} status=0 QTime=1
>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>>> INFO: [] webapp=/solr path=/update/extract params={literal.nokia_longitude=9.76935&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=5&literal.id=6472698&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=15&literal.nokia_ppid=276u0wyw-9544100e68d74162aff54783b9376134&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9981&literal.nokia_postalcode=74405&literal.nokia_street=Kanzleistraße&literal.nokia_tag=Steuerberater&literal.nokia_tag=Business+%26+Service&literal.nokia_title=Consultis+GmbH&literal.nokia_category=215} status=0 QTime=92
>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>>> INFO: [] webapp=/solr path=/update/extract params={literal.nokia_longitude=9.77173970540364&literal.nokia_phone=%2B4979713238&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472699&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=37&literal.nokia_ppid=276u0wyw-9600016fd0d248c9b442111838350f64&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9987182617188&literal.nokia_fax=%2B497971911639&literal.nokia_postalcode=74405&literal.nokia_street=Karlstraße&literal.nokia_title=Videothek,+5th+avenue+Peltekis+Apostolos&literal.nokia_category=5} status=0 QTime=93
>>> <<<<<<
>>>
>>> It is unlikely (but, of course, not out of the question) that this hiccup is due to some reentrancy problem in my test code.  It is much more likely to be some kind of Solr multi-threaded race condition - especially since it looks like a number of requests all failed at precisely the same time.  This is a Solr 1.5 build from mid-to-late March, FWIW.  Does anyone know of an extractingUpdateRequestHandler re-entrancy bug of this kind?
>>>
>>> Thanks,
>>> Karl
>>>
>>>
>>>
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>

