From Patricia Shanahan <p...@acm.org>
Subject Re: Bug fixing
Date Sun, 31 Oct 2010 15:30:24 GMT
Could you set up a special Hudson build that would only run the 
Solaris-failing test on Solaris, and would use a skunk branch rather 
than the trunk?

Given that, I could debug on Solaris-Hudson, by editing in logging 
and/or printouts. I use System.err printouts when I want to ask a 
specialized question that is probably only needed for that step of that 
debug effort, and I can't use Eclipse to get the data.
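
A minimal sketch of that kind of throwaway System.err printout follows. The class name DebugPrint, the tag, and the message are hypothetical, chosen only for illustration; none of them come from the actual code base. It flushes after each line on the assumption that the output is being read from a remote build console.

```java
public class DebugPrint {
    /** Builds one debug line: a tag, the current thread name, and the message. */
    static String format(String tag, String message) {
        return "[" + tag + "] " + Thread.currentThread().getName() + ": " + message;
    }

    /** Writes the line to System.err and flushes it immediately. */
    static void debug(String tag, String message) {
        System.err.println(format(tag, message));
        System.err.flush();
    }

    public static void main(String[] args) {
        // Hypothetical one-off question asked during a single debug step.
        debug("SOLARIS-DEBUG", "entering multicast setup");
    }
}
```

Including the thread name helps when the printout is answering a question about concurrent code; the helper is meant to be deleted once the question is answered.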

It would be slower than debugging on a machine under my direct control, 
but not that much.

I did get some failures from the QA test of my TxnManagerImpl change, so 
I need to investigate, and may need to modify the fix.


Jonathan Costers wrote:
> Hi Patricia
> Great work again.
> I'm going to look into creating a separate Hudson build to run only on
> Solaris nodes today. Likewise, we'll have another one running on Ubuntu
> nodes only.
> The former should fail consistently, the latter should pass consistently.
> At first sight, the issue on Solaris seems to have to do with multicasting
> somehow, but I haven't been able to spend much time investigating.
> Best
> Jonathan
> 2010/10/31 Patricia Shanahan <pats@acm.org>
>> I've found a bug fix to com.sun.jini.mahalo.TxnManagerImpl that makes the
>> entire javaspace category pass. I've started a full QA run with the fix, but
>> that will take about a day. If anything fails as a result of the
>> TxnManagerImpl change I'll have to debug that, but until I get a failure
>> there is nothing more to do on javaspace.
>> There is arguably a problem in the txnmanager test category because it did
>> not detect the bug. Maybe I should add the txnmanager category to a couple
>> of the javaspace tests that use transactions.
>> Time to pick my next bug hunt - I can start something else while the QA test
>> is running. Any opinions?
>> I could take a look at some of the skipped tests to see why they are
>> skipped. Maybe some of them would tell us about real bugs if we ran them.
>> There is also the Solaris-only bug. I can build a Solaris VirtualBox and see
>> if it reproduces. If not, I may need to learn how to run things on a Hudson
>> Solaris in order to investigate.
>> Patricia
>> On 10/20/2010 1:44 PM, Jonathan Costers wrote:
>>> Great job Patricia!
>>> My vote would go to the "javaspace" test category.
>>> Last time I ran that one (250 or so tests IIRC) I got 16 failures.
>>> If we can get these cleared up, I believe we have a solid test base in
>>> place to start validating some new developments and experiments.
>>> Looks like we are really getting some momentum here, I like it a lot.
>>> Thanks to all for your hard work.
>>> Jonathan
>>> 2010/10/20 Patricia Shanahan <pats@acm.org>
>>>> On 10/19/2010 4:08 PM, Patricia Shanahan wrote:
>>>>> I propose modifying TxnManagerImpl to make it match the interface
>>>>> declaration, and allow an abort to be retried. This may break other
>>>>> tests, if they are assuming the behavior that TxnManagerImpl
>>>>> implemented.
>>>> I'm doing a full QA test, including txnmanager. Although the test is
>>>> still running, all of the txnmanager tests, including GetStateTest,
>>>> have passed. Those tests are the most likely to notice the change.
>>>> If the rest of the QA test is clean when it finishes, I'll check in
>>>> the fix and we can add txnmanager to the default category list.
>>>> Any votes on my next bug hunt? For example, are there bug reports in Jira
>>>> that really need to be fixed before the next release?
>>>> Patricia

