river-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Firmstone <j...@zeus.net.au>
Subject Tests failing due to port in use
Date Tue, 09 Apr 2013 19:29:25 GMT
This was being discussed in two separate threads:

Greg,

Are you able to find which process is using a port?  Perhaps jvisualvm 
might help determine which test it is?  If you can narrow it down to a 
bunch of test definitions sharing a test class, by identifying the class 
on the stack, you'll find it.

Sim wrote:
> On 09-04-13 11:44, Peter Firmstone wrote:
>> I've never experienced the issue locally (I see it on Jenkins quite a
>> lot), but I suspect a stale registrar process left from another test may
>> be stopping the socket from closing.  Not that registrars are also
>> simulated for discovery tests, so it may not necessarily be Reggie.
>>
>> The code is duplicated in two places in superclasses of the tests that
>> are failing,  the method portInUse(int port) is supposed to check if the
>> ports available, but only selects from a list of LookupLocator's known
>> to have started, so doesn't actually check the port's available.
>>
>> Perhaps you can find the stale test process responsible?
>
> And other jobs get executed on the same machine at the same time. 
> Maybe these consume ports?
>
> The best way to find the problem looks to me like doing a 'lsof -i' in 
> the test as soons as a port turns out used unexpectedly.
>
> Gr. Simon
>
> -- 
> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
> Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Peter wrote:
> It's a problem I've been aware of for some time, many tests don't 
> specify a port, they just want three or four or however many 
> registrar's started and they have to specify it using a LookupLocator, 
> unfortunately you can't specify an ephemeral LookupLocator, it's not 
> allowed, so if you don't specify a particular port, you get 4160 and 
> the portInUse method is supposed to work around that.
>
> When a test finds a port in use, it updates the registrar wanted to an 
> available port.  If it thinks 4160 is available, when it isn't, boom, 
> test failed.
>
> Regards,
>
> Peter.
>
> Greg Trasuk wrote:
>> I hope you're not adding this to try to help my configuration issue,
>> because I'm pretty sure that's not the problem, and I hate to screw
>> around with the regression tests unneccessarily.
>>
>> Evidence-wise, the "lookup started != initial lookups wanted" issue
>> looks like this:
>> When I enable logging at the FINE level, I see three registrars started,
>> with unicast discovery on random ports - this is Reggie's fallback
>> position when it attempts to open its unicast port and it's already in
>> use.  However, the test that's checking for "lookup started == initial
>> lookups wanted" is testing for three separate registrars on the default
>> port.  Obviously that's an impossible proposition.  My working theory is
>> that the test is supposed to be configured to start Reggie on three
>> separate unicast ports (e.e. 7700, 7701, 7702), and the assertion is
>> checking to see if that happened correctly.  This theory is supported by
>> the existence of
>> "qa/src/com/sun/jini/test/share/reggie3_2Ports.properties", which
>> includes properties called 
>> "net.jini.core.lookup.ServiceRegistrar.port.0=7700",
>> "net.jini.core.lookup.ServiceRegistrar.port.1=7701",
>> "net.jini.core.lookup.ServiceRegistrar.port.2=7702", etc. 
>> Hence my suspicion that I don't understand the configuration mechanism
>> well enough yet.  My theory is that the test is supposed to start up
>> Reggie's on three different ports, but there is something about my test
>> environment setup that causes the configuration to be missed.  Anpother
>> possibility is that the actual test configuration got messed up at some
>> point (I did see a comment in one of the properties files about "these
>> entries are commented out because they are now included in the test
>> descriptions".
>>
>> Testing that theory is going to take me a little while to understand the
>> multi-layer test configuration mechanism.  Unfortunately I've got a full
>> morning of other work ahead of me.
>>
>> Anyhow, I don't really think changing the tests is the correct approach
>> at this point (at least in my troubleshooting process).
>>
>> Cheers,
>>
>> Greg.
>>
>>
>>
>> On Tue, 2013-04-09 at 07:05, peter_firmstone@apache.org wrote:
>>  
>>> Author: peter_firmstone
>>> Date: Tue Apr  9 11:05:32 2013
>>> New Revision: 1465969
>>>
>>> URL: http://svn.apache.org/r1465969
>>> Log:
>>> Be more thorough when checking for port availability on localhost 
>>> during testing, don't assume port free if current test isn't using it.
>>>
>>> Modified:
>>>     
>>> river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java

>>>
>>>
>>> Modified: 
>>> river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java

>>>
>>> URL: 
>>> http://svn.apache.org/viewvc/river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java?rev=1465969&r1=1465968&r2=1465969&view=diff

>>>
>>> ==============================================================================

>>>
>>> --- 
>>> river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java

>>> (original)
>>> +++ 
>>> river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java

>>> Tue Apr  9 11:05:32 2013
>>> @@ -27,7 +27,15 @@ import com.sun.jini.test.spec.discoverys
>>>  import com.sun.jini.test.spec.discoveryservice.AbstractBaseTest;
>>>  import 
>>> com.sun.jini.test.spec.discoveryservice.AbstractBaseTest.DiscoveryStruct; 
>>>
>>>  import 
>>> com.sun.jini.test.spec.discoveryservice.AbstractBaseTest.RegGroupsPair;
>>> +import java.io.IOException;
>>>  import java.net.InetAddress;
>>> +import java.net.InetSocketAddress;
>>> +import java.net.ServerSocket;
>>> +import java.net.Socket;
>>> +import java.net.SocketAddress;
>>> +import java.net.SocketException;
>>> +import java.net.SocketOptions;
>>> +import java.net.SocketTimeoutException;
>>>  import java.net.UnknownHostException;
>>>  import java.rmi.RemoteException;
>>>  import java.util.ArrayList;
>>> @@ -660,12 +668,47 @@ public class LookupServices {
>>>       * @return true if port in use.       */
>>>      private boolean portInUse(int port) {
>>> +        if (port == 0) return false; // Ephemeral
>>>          for(int i=0;i<lookupsStarted.size();i++) {
>>>              LocatorGroupsPair pair = lookupsStarted.get(i);
>>>              int curPort = (pair.getLocator()).getPort();
>>>              if(port == curPort) return true;
>>>          }//end loop
>>> -        return false;
>>> +        // Open a client ephemeral socket and attempt to connect to
>>> +        // port on localhost to see if someone's listening.
>>> +        Socket sock = null;
>>> +        try {
>>> +            sock = new Socket();
>>> +            if (sock instanceof SocketOptions){
>>> +                // Socket terminates with a RST rather than a FIN, 
>>> so there's no TIME_WAIT
>>> +                try {
>>> +                    ((SocketOptions) 
>>> sock).setOption(SocketOptions.SO_LINGER, Integer.valueOf(0));
>>> +                } catch (SocketException se) {
>>> +                    // Ignore, not supported.
>>> +                    logger.log( Level.FINEST, "SocketOptions set 
>>> SO_LINGER threw an Exception", se);
>>> +                }
>>> +            }
>>> +            SocketAddress add = new InetSocketAddress(port);
>>> +            sock.connect(add, 3000); // Try to connect for up to 
>>> three seconds +            // We were able to connect to a socket 
>>> listening on localhost
>>> +            return true;
>>> +        } catch (SocketTimeoutException e){
>>> +            // There might be a stale process assume in use.
>>> +            logger.log( Level.FINEST, "Socket timed out while 
>>> trying to connect", e);
>>> +            return true;
>>> +        } catch (IOException e){
>>> +            // There was nothing listening on the socket so it's 
>>> probably free.
>>> +            // or it timed out.
>>> +            return false;
>>> +        } finally {
>>> +            if (sock != null){
>>> +                try {
>>> +                    sock.close();
>>> +                } catch (IOException ex){
>>> +                    logger.log( Level.FINEST, "Socket threw 
>>> exception while attempting to close", ex);
>>> +                }// Ignore
>>> +            }
>>> +        }
>>>      }//end portInUse
>>>           private void refreshLookupLocatorListsAt(int index){
>>>
>>>     
>>
>>
>>   
>


Mime
View raw message