lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3846) TestReplicationHandler.test always (?) takes many minutes on OS X (lion)
Date Tue, 18 Sep 2012 06:04:08 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457625#comment-13457625 ]

Uwe Schindler commented on SOLR-3846:
-------------------------------------

+1, it is a great improvement! Thanks for testing. I also ran all tests locally on Windows
and they passed successfully!

*On another issue, a possible improvement:* _In my dreams last night_ I found a very good solution
for the problem of connections timing out against IP addresses of "dead" servers (as we have
seen in the above 2 tests). My idea for solving this in a completely predictable way (meaning it
works on every computer, even where firewall settings may delay execution, ...):

We already have a security manager and policy. The idea would be to also implement checkConnect()
and checkResolve() in this custom manager; these are called on every connect. The methods would
check against a set of "default dead servers" (this would work with IP addresses or fake host names):
if called with such an address, they could throw an IOException (like a SocketConnectException)
to make the failure predictable. The underlying O/S's network layer is then never called, so
no timeouts can occur. As SecurityManager cannot throw checked exceptions, I would use:

{code:java}Rethrow.rethrow(new IOException("Emulated network failure"));{code}

inside the SecurityManager.
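
A minimal sketch of this idea, under stated assumptions: the class name DeadHostSecurityManager, the failIfDead and sneakyThrow helpers, and the example host names used with it are hypothetical and not taken from the Solr test framework. sneakyThrow stands in for the Rethrow.rethrow helper mentioned above. The sketch only overrides the two checkConnect overloads that java.lang.SecurityManager declares (the JDK also calls checkConnect with port -1 when resolving a host name) and permits every other connection; a real manager would presumably keep its existing policy checks.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch: a security manager that fails fast for a fixed set
 * of "dead" hosts, so the O/S network layer is never asked to connect.
 */
@SuppressWarnings("removal") // SecurityManager is deprecated on recent JDKs
class DeadHostSecurityManager extends SecurityManager {

    // Hosts (IP addresses or fake host names) treated as unreachable.
    private final Set<String> deadHosts;

    DeadHostSecurityManager(String... deadHosts) {
        this.deadHosts = new HashSet<>(Arrays.asList(deadHosts));
    }

    @Override
    public void checkConnect(String host, int port) {
        // Assumption: all other hosts are simply allowed in this sketch.
        failIfDead(host);
    }

    @Override
    public void checkConnect(String host, int port, Object context) {
        failIfDead(host);
    }

    private void failIfDead(String host) {
        if (deadHosts.contains(host)) {
            // checkConnect declares no checked exceptions, so the
            // IOException is thrown through a generic "sneaky throw".
            DeadHostSecurityManager.<RuntimeException>sneakyThrow(
                new IOException("Emulated network failure"));
        }
    }

    // Throws any Throwable without declaring it; T is erased at runtime,
    // so the cast never fails and the original exception propagates.
    @SuppressWarnings("unchecked")
    private static <T extends Throwable> void sneakyThrow(Throwable t) throws T {
        throw (T) t;
    }
}
```

In a test run, such a manager would be installed via System.setSecurityManager(...) on JVMs that still permit it. Because the IOException is undeclared, callers that want to handle it have to catch Exception or Throwable rather than IOException directly.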
                
> TestReplicationHandler.test always (?) takes many minutes on OS X (lion)
> ------------------------------------------------------------------------
>
>                 Key: SOLR-3846
>                 URL: https://issues.apache.org/jira/browse/SOLR-3846
>             Project: Solr
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 4.0-BETA, 5.0
>         Environment: OS X (Lion). Apparently (see Yonik's notes) this does NOT happen on other operating systems.
> java version "1.6.0_35"
> Java(TM) SE Runtime Environment (build 1.6.0_35-b10-428-11M3811)
> Java HotSpot(TM) 64-Bit Server VM (build 20.10-b01-428, mixed mode)
> Solr trunk and 4.x from 16-Sep, but it's been happening for a couple of weeks at least.
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>             Fix For: 4.0, 5.0
>
>         Attachments: SOLR-3846.patch, SOLR-3846.patch, SOLR-3846.patch, SOLR-3846.patch, SOLR-3846.patch, stacks.txt
>
>
> Here's the seed I was using, but this is apparently unnecessary:
> <JUnit4> says ¡Hola! Master seed: 6785BB3284A15298
> _Eventually_ it seems to complete, but it takes many minutes. For instance, this was reported once, but I usually lose patience and ctrl-c out:
> {code}
> [junit4:junit4] Completed on J2 in 2449.62s, 1 test
> [junit4:junit4] 
> [junit4:junit4] JVM J0:     1.21 ..   266.67 =   265.47s
> [junit4:junit4] JVM J1:     1.21 ..   238.33 =   237.12s
> [junit4:junit4] JVM J2:     1.21 ..  2538.60 =  2537.39s
> [junit4:junit4] JVM J3:     0.97 ..   267.37 =   266.40s
> [junit4:junit4] Execution time total: 42 minutes 18 seconds
> {code}
> and a lot of lines like:
> HEARTBEAT J2: 2012-09-16T17:38:38, no events in:  187s, approx. at: TestReplicationHandler.test
> Yonik reports that he can make this happen 100% of the time on OS X/Lion, which squares with my experience as I recall. Yonik also reports...
> On my linux box (built in '09, PhenomII, HDD) the test takes 50-55 sec.
> On my kids old windows box ('08, athlon64, HDD, Win7) the test takes 88-95 sec.
> On my mac it always takes forever, and I see loops of stuff like this:
> {code}
> SEVERE Master at: http://localhost:62803/solr is not available. Index
> fetch failed. Exception:
> org.apache.solr.client.solrj.SolrServerException: Server refused
> connection at: http://localhost:62803/solr
> [junit4:junit4]   2> 52751 T219 C17 UPDATE [collection1] webapp=/solr
> path=/update params={wt=javabin&version=2} {add=[150]} 0 0
> [junit4:junit4]   2> 52755 T219 C17 UPDATE [collection1] webapp=/solr
> path=/update params={wt=javabin&version=2} {add=[151]} 0 0
> [junit4:junit4]   2> 62758 T215 oash.SnapPuller.fetchLatestIndex
> SEVERE Master at: http://localhost:62803/solr is not available. Index
> fetch failed. Exception:
> {code}
> And I'm soooo happy it's not happening to others and just being swept under the rug, restores my faith. I should have known better ;)
> See the discussion on the dev list labeled "being a good citizen is hard when you can't successfully run tests" for more context.
> I don't know how much time I'll have to dive into it, but I'll certainly be happy to test anyone's patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

