From: Huy Phan <dachuy@gmail.com>
Date: Wed, 14 Oct 2009 02:00:12 +0700
To: common-dev@hadoop.apache.org
Subject: Re: libhdfs with FileSystem cache issue can causes to memory leak ?

Hi Brian,

Thank you for posting your solution here, I will try this on my testing
server and do some load tests. Also thank you for pointing out some
leaks inside libhdfs. Actually I'm writing a Python extension for HDFS
and noticed some memory leaks, but I was not sure whether the bug was in
my extension or somewhere else.

Regards,
Huy Phan

Brian Bockelman wrote:
> Hey Huy,
>
> Here's what we do:
>
> 1) Include hdfsJniHelper.h
> 2) Do the following when you're done with the filesystem:
>
> if (NULL != fs) {
>     // Get the JNIEnv* corresponding to the current thread
>     JNIEnv* env = getJNIEnv();
>
>     if (env == NULL) {
>         ret = -EIO;
>     } else {
>         // Parameters
>         jobject jFS = (jobject)fs;
>
>         // Release unnecessary references
>         (*env)->DeleteGlobalRef(env, jFS);
>     }
> }
>
> I also recommend the below patch to remove a few other leaks.
> This saves about 0.5 KB of leaked memory per file open.
>
> Index: src/c++/libhdfs/hdfs.c
> ===================================================================
> --- src/c++/libhdfs/hdfs.c    (revision 806186)
> +++ src/c++/libhdfs/hdfs.c    (working copy)
> @@ -248,6 +249,7 @@
>          destroyLocalReference(env, jUserString);
>          destroyLocalReference(env, jGroups);
>          destroyLocalReference(env, jUgi);
> +        destroyLocalReference(env, jAttrString);
>      }
>  #else
>
> Index: src/c++/libhdfs/hdfsJniHelper.c
> ===================================================================
> --- src/c++/libhdfs/hdfsJniHelper.c    (revision 806186)
> +++ src/c++/libhdfs/hdfsJniHelper.c    (working copy)
> @@ -239,6 +241,7 @@
>          fprintf(stderr, "ERROR: jelem == NULL\n");
>      }
>      (*env)->SetObjectArrayElement(env, result, i, jelem);
> +    (*env)->DeleteLocalRef(env, jelem);
>  }
>  return result;
>  }
>
> Of course, this is not an official solution, not supported, may
> explode, etc.
>
> Brian
>
> On Oct 13, 2009, at 12:40 PM, Huy Phan wrote:
>
>> Hi Eli,
>>
>> You're right that the problem is resolved in 0.20 with the
>> newInstance() function; unfortunately my system is running on Hadoop
>> 0.18.3 and I'm still looking for a way to patch this version without
>> affecting the current system.
>>
>> Regards,
>> Huy Phan
>>
>> Eli Collins wrote:
>>> Hey Huy,
>>>
>>> What version of Hadoop are you using? I think HADOOP-4655 may have
>>> resolved the issue you're seeing, but I believe it is only in 0.20
>>> and later.
>>>
>>> Thanks,
>>> Eli
>>>
>>> On Mon, Oct 12, 2009 at 8:52 PM, Huy Phan wrote:
>>>
>>>> Hi All,
>>>>
>>>> I'm writing a multi-threaded application using libhdfs in C. A
>>>> known issue of HDFS is that the FileSystem API caches FileSystem
>>>> handles and always returns the same FileSystem handle when called
>>>> from different threads. It means that even though I called
>>>> hdfsConnect many times, I should not call hdfsDisconnect in any
>>>> single thread.
>>>> This may lead to a memory leak in the system; do you know any
>>>> workaround for this issue?