Hey Huy,
Here's what we do:
1) Include hdfsJniHelper.h
2) Do the following when you're done with the filesystem:
if (NULL != fs) {
    //Get the JNIEnv* corresponding to current thread
    JNIEnv* env = getJNIEnv();
    if (env == NULL) {
        ret = -EIO;
    } else {
        //Parameters
        jobject jFS = (jobject)fs;
        //Release unnecessary references
        (*env)->DeleteGlobalRef(env, jFS);
    }
}
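If several threads share the one cached handle, that release should only run once, after the last thread is done with it. Here is a minimal sketch of that bookkeeping. The names shared_fs, shared_fs_acquire, shared_fs_release and count_release are made up for illustration and are not part of libhdfs; in real code the release callback would run the DeleteGlobalRef snippet above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical callback type: releases the underlying handle. */
typedef void (*release_fn)(void *handle);

/* Hypothetical wrapper around one handle shared by many threads. */
typedef struct {
    pthread_mutex_t lock;
    void *handle;        /* e.g. the hdfsFS returned by hdfsConnect */
    int users;           /* acquires not yet matched by a release */
    release_fn release;  /* runs once, when users drops to zero */
} shared_fs;

static void shared_fs_init(shared_fs *s, void *handle, release_fn release) {
    pthread_mutex_init(&s->lock, NULL);
    s->handle = handle;
    s->users = 0;
    s->release = release;
}

/* Register one more user and return the shared handle
 * (NULL if the handle has already been released). */
static void *shared_fs_acquire(shared_fs *s) {
    void *h;
    pthread_mutex_lock(&s->lock);
    s->users++;
    h = s->handle;
    pthread_mutex_unlock(&s->lock);
    return h;
}

/* Drop one user. Returns 1 if this call released the handle, else 0. */
static int shared_fs_release(shared_fs *s) {
    int released = 0;
    pthread_mutex_lock(&s->lock);
    assert(s->users > 0);  /* every release must match an acquire */
    if (--s->users == 0 && s->handle != NULL) {
        s->release(s->handle);
        s->handle = NULL;
        released = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return released;
}

/* Stand-in callback for demonstration: counts how often it is called.
 * Here the "handle" is just a pointer to an int counter. */
static void count_release(void *handle) {
    ++*(int *)handle;
}
```

Each thread would call shared_fs_acquire() where it currently calls hdfsConnect and shared_fs_release() where it would call hdfsDisconnect; only the last release reaches the callback.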
I also recommend the patch below to remove a few other leaks. It
saves about 0.5 KB of leaked memory per file open.
Index: src/c++/libhdfs/hdfs.c
===================================================================
--- src/c++/libhdfs/hdfs.c (revision 806186)
+++ src/c++/libhdfs/hdfs.c (working copy)
@@ -248,6 +249,7 @@
destroyLocalReference(env, jUserString);
destroyLocalReference(env, jGroups);
destroyLocalReference(env, jUgi);
+ destroyLocalReference(env, jAttrString);
}
#else
Index: src/c++/libhdfs/hdfsJniHelper.c
===================================================================
--- src/c++/libhdfs/hdfsJniHelper.c (revision 806186)
+++ src/c++/libhdfs/hdfsJniHelper.c (working copy)
@@ -239,6 +241,7 @@
fprintf(stderr, "ERROR: jelem == NULL\n");
}
(*env)->SetObjectArrayElement(env, result, i, jelem);
+ (*env)->DeleteLocalRef(env, jelem);
}
return result;
}
Of course, this is not an official solution, not supported, may
explode, etc.
Brian
On Oct 13, 2009, at 12:40 PM, Huy Phan wrote:
> Hi Eli,
> You're right that the problem is resolved in 0.20 with the
> newInstance() function. Unfortunately, my system is running Hadoop
> 0.18.3, and I'm still looking for a way to patch this version without
> affecting the current system.
>
> Regards,
> Huy Phan
>
> Eli Collins wrote:
>> Hey Huy,
>>
>> What version of Hadoop are you using? I think HADOOP-4655 may have
>> resolved the issue you're seeing, but I think it is only in 0.20 and later.
>>
>> Thanks,
>> Eli
>>
>> On Mon, Oct 12, 2009 at 8:52 PM, Huy Phan <dachuy@gmail.com> wrote:
>>
>>> Hi All,
>>> I'm writing a multithreaded application using libhdfs in C. A known
>>> issue with HDFS is that the FileSystem API caches FileSystem handles
>>> and always returns the same handle when called from different
>>> threads. This means that even though I call hdfsConnect many times,
>>> I should not call hdfsDisconnect in any single thread.
>>> This may lead to a memory leak. Do you know of any workaround for
>>> this issue?
>>>
>>>
>>
>>