Message-ID: <4AE09274.50306@streamy.com>
Date: Thu, 22 Oct 2009 10:12:20 -0700
From: Jonathan Gray
To: hbase-user@hadoop.apache.org
Subject: Re: HBASE-1927 (was Re: HBase 0.20.1 scanners not closing properly (memory leak))
References: <427D9107-6040-464C-8E27-FC31F07F3A52@gmail.com>
 <29bed2720910210523u2c73676q4c4db31009e51491@mail.gmail.com>
 <7c962aed0910211455m3c49369co5c3242714ab349b@mail.gmail.com>
 <50D5537D-9558-4F6C-9053-842C66212612@gmail.com>
In-Reply-To: <50D5537D-9558-4F6C-9053-842C66212612@gmail.com>

Erik,

I just put up a patch with the fix you described and a unit test that
replicates the behavior.

Please test to confirm it works. If so, drop a note in the issue and I
will commit.

Thanks for finding the bug.

JG

Erik Rozendaal wrote:
> Issue created: HBASE-1927
>
> On 21 Oct 2009, at 23:55, stack wrote:
>
>> Thanks for digging in Erik. Nice one. Would you mind making an issue of
>> your findings? File it against 0.20.2 so we can roll out the fix in the
>> next 0.20.x release.
>> St.Ack
>>
>> On Wed, Oct 21, 2009 at 2:25 PM, Erik Rozendaal wrote:
>>
>>> I did some more investigation into this issue, since after the original
>>> issue stopped occurring I noticed that MemStoreScanners were still
>>> being leaked when scanning a store with an empty MemStore.
>>>
>>> The cause looks to be the KeyValueHeap constructor. It drops scanners
>>> when the scanner's peek() method returns null (line 58 of KeyValueHeap).
>>>
>>> Unfortunately some scanners (like StoreScanner and MemStoreScanner)
>>> register themselves to some global list when constructed and only
>>> deregister on close().
>>>
>>> So a quick fix may be to add an
>>>
>>>     } else {
>>>       scanner.close();
>>>     }
>>>
>>> to the KeyValueHeap constructor when the scanner is not added. I'm not
>>> sure if this is the cleanest fix though...
>>>
>>> Regards,
>>> Erik
>>>
>>>
>>> On 21 Oct 2009, at 14:23, Guilherme Germoglio wrote:
>>>
>>>> Hello Erik,
>>>>
>>>> I think your attachments were blocked. Could you please upload them
>>>> somewhere else?
>>>>
>>>> On Wed, Oct 21, 2009 at 8:44 AM, Erik Rozendaal wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> After some performance testing on my HBase 0.20.1 development
>>>>> environment (running in pseudo- and full-distributed mode on a single
>>>>> laptop) I noticed that scanners do not get closed properly on the
>>>>> region server. After creating a heap dump with Netbeans I can see the
>>>>> StoreScanner instances are still in the Store.changedReaderObservers
>>>>> collection.
>>>>>
>>>>> Each StoreScanner instance has the "closed" flag set to false, so it
>>>>> looks like the StoreScanner.close() method was never called.
>>>>>
>>>>> I double-checked my client code and counted the number of times I
>>>>> create and close a scanner, and these counts match. I also found this
>>>>> is repeatable from the hbase shell. Open the shell, scan some table,
>>>>> take a heap dump and you'll find unclosed StoreScanner objects in the
>>>>> Store.changedReaderObservers collection.
>>>>>
>>>>> I've attached screenshots of the number of StoreScanner instances
>>>>> (after 30.001 scans) and the Store.changedReaderObservers collection
>>>>> of one of the Stores (notice that the closed field's value is
>>>>> 0 => false).
>>>>>
>>>>> Ultimately the region server runs out of memory and crashes.
>>>>>
>>>>> Has anyone experienced similar problems?
>>>>>
>>>>> Regards,
>>>>> Erik
>>>>
>>>> --
>>>> Guilherme
>>>>
>>>> msn: guigermoglio@hotmail.com
>>>> homepage: http://sites.google.com/site/germoglio/
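
For anyone reading the thread later, the leak and the quick fix described above can be sketched in a few lines of standalone Java. This is only a toy model, not the HBase source or the committed patch: ToyScanner, ToyHeap, LeakDemo, REGISTRY and the closeDropped flag are all hypothetical names made up for illustration. The registry stands in for a global list such as Store.changedReaderObservers, and key ordering inside the heap is left out because it does not affect the leak.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical scanner: registers itself when constructed and
// deregisters only in close(), as the thread describes.
final class ToyScanner {
    // Stand-in for a global observer list (assumption, not HBase code).
    static final Set<ToyScanner> REGISTRY = new HashSet<>();

    private final List<String> keys;

    ToyScanner(List<String> keys) {
        this.keys = keys;
        REGISTRY.add(this);
    }

    // Returns the first key, or null when there is nothing to scan.
    String peek() {
        return keys.isEmpty() ? null : keys.get(0);
    }

    void close() {
        REGISTRY.remove(this);
    }
}

// Hypothetical heap: keeps only scanners that have data.
final class ToyHeap {
    private final List<ToyScanner> kept = new ArrayList<>();

    ToyHeap(List<ToyScanner> scanners, boolean closeDropped) {
        for (ToyScanner scanner : scanners) {
            if (scanner.peek() != null) {
                kept.add(scanner);
            } else if (closeDropped) {
                // The quick fix: a scanner that is not added must be closed
                // here, because nothing else holds a reference to it.
                scanner.close();
            }
        }
    }

    // Closes only the scanners the heap still knows about.
    void close() {
        for (ToyScanner scanner : kept) {
            scanner.close();
        }
        kept.clear();
    }
}

final class LeakDemo {
    // Returns how many scanners remain registered after the heap is closed.
    static int leakedAfterScan(boolean closeDropped) {
        ToyScanner.REGISTRY.clear();
        List<ToyScanner> scanners = new ArrayList<>();
        scanners.add(new ToyScanner(Arrays.asList("row1", "row2")));
        scanners.add(new ToyScanner(Collections.<String>emptyList())); // empty store
        ToyHeap heap = new ToyHeap(scanners, closeDropped);
        heap.close();
        return ToyScanner.REGISTRY.size();
    }
}
```

Calling LeakDemo.leakedAfterScan(false) leaves the scanner for the empty store in the registry even though the heap itself was closed, which matches the symptom of matching create/close counts on the client side. With closeDropped set to true the registry ends up empty.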