hadoop-user mailing list archives

From Himanshu Gupta <para_...@yahoo.com>
Subject FileSystem.get(Uri,Configuration,String) caching issue
Date Mon, 10 Sep 2012 11:29:48 GMT
I am using FileSystem.get(URI uri, Configuration conf, String user) to create
FileSystem instances (LocalFileSystem in this case). As far as I know,
FileSystem keeps an internal cache of these objects keyed on URI and user, so
calling FileSystem.get(..) multiple times with the same URI and user should
create and cache only one LocalFileSystem instance. However, I observed (with
hadoop-core-1.0.0) that each call creates a new LocalFileSystem instance and
puts it in the cache, which leads to memory issues.

Please see the code below and let me know if I am doing something wrong.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FileSystemCacheIssue {

    // Requests a FileSystem for the local file URI on behalf of the given user.
    private static FileSystem getFileSystem(String user) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "file:///");
        return FileSystem.get(new URI("file:///"), conf, user);
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 1000; i++) {
            FileSystem fs = getFileSystem("himanshg");
        }
        // Put a breakpoint here and look at the heap dump for the number of
        // LocalFileSystem instances. Ideally I expect it to be 1, but there are 1001.
        System.out.println("Keep your debugger here and check.");
    }
}
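
To make the expectation concrete, here is a minimal sketch of the caching behavior I expected, written without Hadoop so it runs on its own. It is not Hadoop's code: CacheKeySketch, NameKey, HandleKey and UserHandle are hypothetical names invented for illustration. It also shows one way a cache like this could miss on every call, namely if the key held a per-call user object that is compared by identity instead of by name. Whether that is what actually happens in hadoop-core-1.0.0 is an assumption I have not verified.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheKeySketch {

    // Hypothetical per-call user object. It does not override equals/hashCode,
    // so two handles with the same name are still different keys.
    static final class UserHandle {
        final String name;
        UserHandle(String name) { this.name = name; }
    }

    // Hypothetical key that compares the user by name (the behavior I expected).
    static final class NameKey {
        final String uri;
        final String user;
        NameKey(String uri, String user) { this.uri = uri; this.user = user; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof NameKey)) return false;
            NameKey k = (NameKey) o;
            return uri.equals(k.uri) && user.equals(k.user);
        }
        @Override public int hashCode() { return 31 * uri.hashCode() + user.hashCode(); }
    }

    // Hypothetical key that compares the user handle by identity (assumed failure mode).
    static final class HandleKey {
        final String uri;
        final UserHandle user;
        HandleKey(String uri, UserHandle user) { this.uri = uri; this.user = user; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof HandleKey)) return false;
            HandleKey k = (HandleKey) o;
            return uri.equals(k.uri) && user == k.user;
        }
        @Override public int hashCode() {
            return 31 * uri.hashCode() + System.identityHashCode(user);
        }
    }

    // Simulates repeated lookups with a name-based key; returns the cache size.
    public static int entriesWithNameKey(int calls) {
        Map<NameKey, Object> cache = new HashMap<NameKey, Object>();
        for (int i = 0; i < calls; i++) {
            NameKey key = new NameKey("file:///", "himanshg");
            if (!cache.containsKey(key)) {
                cache.put(key, new Object()); // stands in for a new file system instance
            }
        }
        return cache.size();
    }

    // Simulates repeated lookups where each call builds a fresh user handle.
    public static int entriesWithHandleKey(int calls) {
        Map<HandleKey, Object> cache = new HashMap<HandleKey, Object>();
        for (int i = 0; i < calls; i++) {
            HandleKey key = new HandleKey("file:///", new UserHandle("himanshg"));
            if (!cache.containsKey(key)) {
                cache.put(key, new Object());
            }
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("name-based key:     " + entriesWithNameKey(1000));
        System.out.println("identity-based key: " + entriesWithHandleKey(1000));
    }
}
```

With the name-based key the cache ends up with a single entry after 1000 lookups. With the identity-based key it ends up with 1000 entries, which resembles the growth I see in the heap dump.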