lucene-dev mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Repeat to the right list: Solr spewage and possible re-entrancy problem?
Date Mon, 14 Jun 2010 12:46:55 GMT
Hey Karl,

The TIME_WAIT states you see are OK from the TCP perspective. The end
that sends the first FIN goes into the TIME_WAIT state, because that
is the end that sends the final ACK. If the other end's FIN is lost,
or if the final ACK is lost, having the end that sends the first FIN
maintain state about the connection guarantees that it has enough
information to retransmit the final ACK.
The socket will stay in TIME_WAIT for 2 * MSL (twice the maximum
segment lifetime; twice because of the round trip).
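
To put a rough number on that, here is a back-of-the-envelope sketch. It
assumes a steady load and a TIME_WAIT duration of 60 seconds (the fixed
value on Linux; other systems differ), in which case the number of
sockets in TIME_WAIT is about the rate of closed connections times the
TIME_WAIT duration. The class and method names are made up for
illustration; 28223 is the count from the netstat output quoted below.

```java
public class TimeWaitEstimate {

    /**
     * Estimates how many connections per second must be closing to keep
     * the given number of sockets in TIME_WAIT under steady load.
     */
    static double closesPerSecond(long socketsInTimeWait, double timeWaitSeconds) {
        if (timeWaitSeconds <= 0) {
            throw new IllegalArgumentException("timeWaitSeconds must be positive");
        }
        return socketsInTimeWait / timeWaitSeconds;
    }

    public static void main(String[] args) {
        long observed = 28223;          // from the quoted netstat count
        double assumedTimeWait = 60.0;  // assumption: Linux-style 60 s TIME_WAIT
        System.out.printf("approx. closes per second: %.1f%n",
                closesPerSecond(observed, assumedTimeWait));
    }
}
```

Under those assumptions the count corresponds to roughly 470 closed
connections per second, which would be consistent with a client that
opens a new connection for every request instead of reusing them.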

As long as SO_LINGER is enabled, the close operation on a socket will
wait until all queued messages have been sent. See this:

“When enabled, a close(2) or shutdown(2) will not return until all
queued messages for the socket have been successfully sent or the
linger timeout has been reached. Otherwise, the call returns
immediately and the closing is done in the background. When the socket
is closed as part of exit(2), it always lingers in the background.”

By default I think this is enabled, and in the Tomcat case set to 25 seconds.

I am not sure if that helps you with your problem but you could try
setting it to a lower value or disable it completely.
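
As a minimal sketch of what "lower value or disable" looks like at the
socket level in Java: java.net.Socket exposes the option through
setSoLinger and getSoLinger. The class name LingerSketch and the helper
configureLinger are hypothetical, and this only shows the plain socket
API, not how a particular servlet container is configured.

```java
import java.io.IOException;
import java.net.Socket;

public class LingerSketch {

    /**
     * Enables SO_LINGER with the given timeout in seconds,
     * or disables it when seconds is negative.
     */
    static void configureLinger(Socket socket, int seconds) throws IOException {
        if (seconds < 0) {
            socket.setSoLinger(false, 0);   // close() returns immediately
        } else {
            socket.setSoLinger(true, seconds);
        }
    }

    public static void main(String[] args) throws IOException {
        // An unconnected socket is enough to set and read back the option.
        try (Socket socket = new Socket()) {
            configureLinger(socket, 5);
            System.out.println("linger enabled, seconds: " + socket.getSoLinger());
            configureLinger(socket, -1);
            // getSoLinger() returns -1 when the option is disabled.
            System.out.println("linger disabled: " + socket.getSoLinger());
        }
    }
}
```

Note that SO_LINGER only controls how close() behaves; it is a separate
setting from the TIME_WAIT duration described above.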

simon

On Mon, Jun 14, 2010 at 2:20 PM,  <karl.wright@nokia.com> wrote:
> Good catch!
>
> root@duck6:~# netstat -an | fgrep :8983 | wc
>  28223  169338 2257840
> root@duck6:~#
>
> ... and here's an example:
>
> tcp6       0      0 127.0.0.1:8983          127.0.0.1:44058         TIME_WAIT
>
> So, once again, what causes this behavior?  How can I wind up with 28,000 socket connections
> hanging around, if both my client and Solr are behaving properly and are closing connections
> properly?
>
> (I suspect that the answer to my somewhat rhetorical question is, "this should not happen".
> But then the question becomes, "why IS it happening?")
>
> Karl
>
> -----Original Message-----
> From: Wright Karl (Nokia-S/Cambridge)
> Sent: Sunday, June 13, 2010 7:52 AM
> To: dev@lucene.apache.org
> Subject: RE: Repeat to the right list: Solr spewage and possible re-entrancy problem?
>
> Good idea.
>
> How would you prevent such a thing from occurring on the server?  Or would this be the
> result of the client not doing something properly?
>
> Karl
>
> ________________________________________
> From: ext Lance Norskog [goksron@gmail.com]
> Sent: Saturday, June 12, 2010 11:55 PM
> To: dev@lucene.apache.org
> Subject: Re: Repeat to the right list: Solr spewage and possible re-entrancy problem?
>
> There are situations where zombie sockets pile up at the server and
> keep zombie threads open. When this happens, check the total number of
> threads in the server JVM, and the total number of open or TIME_WAIT
> sockets. 'netstat -an | fgrep :8983' may find 2000 entries.
>
> Lance
>
> On Mon, Jun 7, 2010 at 7:35 AM,  <karl.wright@nokia.com> wrote:
>> Hi folks,
>>
>> This morning I was experimenting with using multiple threads while indexing
>> some 20,000,000 records worth of content.  In fact, my test spun up some 50
>> threads, and happily chugged away for a couple of hours before I saw the
>> following output from my test code:
>>
>>>>>>>>
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6469124
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6469551
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6470592
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6472454
>> java.net.SocketException: Connection reset
>>         at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>         at HttpPoster.getResponse(HttpPoster.java:280)
>>         at HttpPoster.indexPost(HttpPoster.java:191)
>>         at ParseAndLoad$PostThread.run(ParseAndLoad.java:638)
>> <<<<<<
>>
>> Looking at the solr-side output, I see nothing interesting at all:
>>
>>>>>>>>
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract
>> params={literal.nokia_longitude=9.78518981933594&literal.nokia_phone=%2B497971910474&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Münster&literal.nokia_placerating=0&literal.id=6472724&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=1&literal.nokia_ppid=276u0wyw-c8cb7f4d6cd84a639a4e7d3570bf8814&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9985514322917&literal.nokia_postalcode=74405&literal.nokia_street=Weinhaldenstraße&literal.nokia_title=Dorfgemeinschaft+Münster+e.V.&literal.nokia_category=261}
>> status=0 QTime=1
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract
>> params={literal.nokia_longitude=9.76717020670573&literal.nokia_phone=%2B497971950725&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472737&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=13&literal.nokia_ppid=276u0wyw-d3bed6449fcb41b0adc50ae08e041f8d&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9974405924479&literal.nokia_fax=%2B497971950712&literal.nokia_postalcode=74405&literal.nokia_street=Kochstraße&literal.nokia_title=BayWa+AG+Bau-+%26+Gartenmarkt&literal.nokia_category=194}
>> status=0 QTime=0
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract
>> params={literal.nokia_longitude=9.77591044108073&literal.nokia_phone=%2B49797124009&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Unterrot&literal.nokia_placerating=0&literal.id=6472739&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=28&literal.nokia_ppid=276u0wyw-d534d7a9235a4edf878d5e32a34bad8b&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9791788736979&literal.nokia_fax=%2B49797123431&literal.nokia_postalcode=74405&literal.nokia_street=Hauptstraße&literal.nokia_title=Gastel+R.&literal.nokia_category=5}
>> status=0 QTime=1
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract
>> params={literal.nokia_longitude=9.76935&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=5&literal.id=6472698&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=15&literal.nokia_ppid=276u0wyw-9544100e68d74162aff54783b9376134&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9981&literal.nokia_postalcode=74405&literal.nokia_street=Kanzleistraße&literal.nokia_tag=Steuerberater&literal.nokia_tag=Business+%26+Service&literal.nokia_title=Consultis+GmbH&literal.nokia_category=215}
>> status=0 QTime=92
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract
>> params={literal.nokia_longitude=9.77173970540364&literal.nokia_phone=%2B4979713238&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472699&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=37&literal.nokia_ppid=276u0wyw-9600016fd0d248c9b442111838350f64&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9987182617188&literal.nokia_fax=%2B497971911639&literal.nokia_postalcode=74405&literal.nokia_street=Karlstraße&literal.nokia_title=Videothek,+5th+avenue+Peltekis+Apostolos&literal.nokia_category=5}
>> status=0 QTime=93
>> <<<<<<
>>
>> It is unlikely (but, of course, not out of the question) that this hiccup is
>> due to some reentrancy problem in my test code.  It is much more likely to
>> be some kind of a Solr multi-threaded race condition - especially since it
>> looks like a number of requests all failed at precisely the same time.  This
>> is a Solr 1.5 build from mid-late March, FWIW.  Does anyone know of an
>> extractingUpdateRequestHandler re-entrancy bug of this kind?
>>
>> Thanks,
>> Karl
>>
>>
>>
>>
>>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
