From: Nigel Daley
Subject: Re: Patch testing
Date: Fri, 07 Jan 2011 14:11:08 -0800
To: general@hadoop.apache.org

Hrm, the MR precommit test I'm running has hung (been running for 14 hours
so far). FWIW, 2 HDFS precommit tests are hung too. I suspect it could be
the NFS mounts on the machines. I forced a thread dump which you can see in
the console:
https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/10/console

Any other ideas why these might be hanging?

Thanks,
Nige

On Jan 5, 2011, at 5:42 PM, Todd Lipcon wrote:

> On Wed, Jan 5, 2011 at 4:39 PM, Nigel Daley wrote:
>
>> Thanks for looking into it Todd. Let's first see if you think it can be
>> fixed quickly. Let me know.
>>
>
> No problem, it wasn't too bad after all. Patch up on HADOOP-7087 which fixes
> this test timeout for me.
>
> -Todd
>
>> On Jan 5, 2011, at 4:33 PM, Todd Lipcon wrote:
>>
>>> On Wed, Jan 5, 2011 at 4:19 PM, Nigel Daley wrote:
>>>
>>>> Todd, would love to get
>>>> https://issues.apache.org/jira/browse/MAPREDUCE-2121 fixed first since
>>>> this is failing every night on trunk.
>>>>
>>>
>>> What if we disable that test, move that issue to 0.22 blocker, and then
>>> enable the test-patch?
>>> I'll also look into that one today, but if it's
>>> something that will take a while to fix, I don't think we should hold off
>>> the useful testing for all the other patches.
>>>
>>> -Todd
>>>
>>> On Jan 5, 2011, at 2:45 PM, Todd Lipcon wrote:
>>>>
>>>>> Hi Nigel,
>>>>>
>>>>> MAPREDUCE-2172 has been fixed for a while. Are there any other particular
>>>>> JIRAs you think need to be fixed before the MR test-patch queue gets
>>>>> enabled? I have a lot of outstanding patches and doing all the test-patch
>>>>> turnaround manually on 3 different boxes is a real headache.
>>>>>
>>>>> Thanks
>>>>> -Todd
>>>>>
>>>>> On Tue, Dec 21, 2010 at 1:33 PM, Nigel Daley wrote:
>>>>>
>>>>>> Ok, HDFS is now enabled. You'll see a stream of updates shortly on the ~30
>>>>>> Patch Available HDFS issues.
>>>>>>
>>>>>> Nige
>>>>>>
>>>>>> On Dec 20, 2010, at 12:42 PM, Jakob Homan wrote:
>>>>>>
>>>>>>> I committed HDFS-1511 this morning. We should be good to go. I can
>>>>>>> haz snooty robot butler?
>>>>>>>
>>>>>>> On Fri, Dec 17, 2010 at 8:31 PM, Konstantin Boudnik wrote:
>>>>>>>> Thanks Jacob. I am wasted already but I can do it on Sun, I think,
>>>>>>>> unless it is done earlier.
>>>>>>>> --
>>>>>>>> Take care,
>>>>>>>> Konstantin (Cos) Boudnik
>>>>>>>>
>>>>>>>> On Fri, Dec 17, 2010 at 19:41, Jakob Homan wrote:
>>>>>>>>> Ok. I'll get a patch out for 1511 tomorrow, unless someone wants to
>>>>>>>>> whip one up tonight.
>>>>>>>>>
>>>>>>>>> On Fri, Dec 17, 2010 at 7:22 PM, Nigel Daley wrote:
>>>>>>>>>> I agree with Cos on fixing HDFS-1511 first. Once that is done I'll
>>>>>>>>>> enable hdfs patch testing.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Nige
>>>>>>>>>>
>>>>>>>>>> Sent from my iPhone4
>>>>>>>>>>
>>>>>>>>>> On Dec 17, 2010, at 7:01 PM, Konstantin Boudnik wrote:
>>>>>>>>>>
>>>>>>>>>>> One more issue needs to be addressed before test-patch is turned on HDFS is
>>>>>>>>>>> https://issues.apache.org/jira/browse/HDFS-1511
>>>>>>>>>>> --
>>>>>>>>>>> Take care,
>>>>>>>>>>> Konstantin (Cos) Boudnik
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Dec 17, 2010 at 16:17, Konstantin Boudnik <cos@apache.org> wrote:
>>>>>>>>>>>> Considering that because of these 4 faulty cases every patch will be
>>>>>>>>>>>> -1'ed a patch author will still have to look at it and make a comment
>>>>>>>>>>>> why this particular -1 isn't valid. Lesser work, perhaps, but messier
>>>>>>>>>>>> IMO. I'm not blocking it - I just feel like there's a better way.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Take care,
>>>>>>>>>>>> Konstantin (Cos) Boudnik
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Dec 17, 2010 at 15:55, Jakob Homan wrote:
>>>>>>>>>>>>>> If HDFS is added to the test-patch queue right now we get
>>>>>>>>>>>>>> nothing but dozens of -1'ed patches.
>>>>>>>>>>>>> There aren't dozens of patches being submitted currently. The -1
>>>>>>>>>>>>> isn't the important thing, it's the grunt work of actually running
>>>>>>>>>>>>> (and waiting) for the tests, test-patch, etc. that Hudson does so that
>>>>>>>>>>>>> the developer doesn't have to.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:48 PM, Dhruba Borthakur <dhruba@gmail.com> wrote:
>>>>>>>>>>>>>> +1, thanks for doing this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:19 PM, Jakob Homan <jghoman@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, with test-patch updated to show the failing tests, saving the
>>>>>>>>>>>>>>> developers the need to go and verify that the failed tests are all
>>>>>>>>>>>>>>> known, how do people feel about turning on test-patch again for HDFS
>>>>>>>>>>>>>>> and mapred? I think it'll help prevent any more tests from entering
>>>>>>>>>>>>>>> the "yeah, we know" category.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> jg
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Nov 17, 2010 at 5:08 PM, Jakob Homan <jhoman@yahoo-inc.com> wrote:
>>>>>>>>>>>>>>>> True, each patch would get a -1 and the failing tests would need to be
>>>>>>>>>>>>>>>> verified as those known bad (BTW, it would be great if Hudson could list
>>>>>>>>>>>>>>>> which tests failed in the message it posts to JIRA). But that's still quite
>>>>>>>>>>>>>>>> a bit less error-prone work than if the developer runs the tests and
>>>>>>>>>>>>>>>> test-patch themselves. Also, with 22 being cut, there are a lot of patches
>>>>>>>>>>>>>>>> up in the air and several developers are juggling multiple patches. The
>>>>>>>>>>>>>>>> more automation we can have, even if it's not perfect, will decrease errors
>>>>>>>>>>>>>>>> we may make.
>>>>>>>>>>>>>>>> -jg
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Nigel Daley wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Nov 17, 2010, at 3:11 PM, Jakob Homan wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It's also ready to run on MapReduce and HDFS but we won't turn it on
>>>>>>>>>>>>>>>>>>> until these projects build and test cleanly. Looks like both these projects
>>>>>>>>>>>>>>>>>>> currently have test failures.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Assuming the projects are compiling and building, is there a reason to
>>>>>>>>>>>>>>>>>> not turn it on despite the test failures? Hudson is invaluable to developers
>>>>>>>>>>>>>>>>>> who then don't have to run the tests and test-patch themselves. We didn't
>>>>>>>>>>>>>>>>>> turn Hudson off when it was working previously and there were known
>>>>>>>>>>>>>>>>>> failures. I think one of the reasons we have more failing tests now is the
>>>>>>>>>>>>>>>>>> higher cost of doing Hudson's work (not a great excuse I know). This is
>>>>>>>>>>>>>>>>>> particularly true now because several of the failing tests involve tests
>>>>>>>>>>>>>>>>>> timing out, making the whole testing regime even longer.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Every single patch would get a -1 and need investigation. Currently, that
>>>>>>>>>>>>>>>>> would be about 83 investigations between MR and HDFS issues that are in
>>>>>>>>>>>>>>>>> patch available state. Shouldn't we focus on getting these tests fixed or
>>>>>>>>>>>>>>>>> removed? Also, I need to get MAPREDUCE-2172 fixed (applies to HDFS as
>>>>>>>>>>>>>>>>> well) before I turn this on.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Nige
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Connect to me at http://www.facebook.com/dhruba
>>>>>>>>>>>>>>
>>>>>
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>>
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera