hadoop-hdfs-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-925) Make it harder to accidentally close a shared DFSClient
Date Fri, 04 Feb 2011 19:50:31 GMT

    [ https://issues.apache.org/jira/browse/HDFS-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990700#comment-12990700 ]

Steve Loughran commented on HDFS-925:

The latest patch adds a test for expected behaviour. 

1. If you call FileSystem.get(conf), you get a shared client; if you ask for a new instance,
you get a different instance.

2. If you close any of the shared instances, all other attempts to use that shared instance
will fail. 

3. If you call FileSystem.getNewInstance(conf), that configuration is different.

4. If, after closing a shared client instance, you call FileSystem.get(conf), you get
a new instance.
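
The four behaviours above can be sketched with a toy cache. This is a model for illustration
only, not Hadoop code: ToyClient, ToyClientCache and the string key are hypothetical names,
and the real FileSystem cache is assumed to be more involved than this.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of a shared-client cache; NOT Hadoop's implementation.
final class ToyClient {
    private final ToyClientCache owner; // null for uncached (private) instances
    private final String key;
    private boolean open = true;

    ToyClient(ToyClientCache owner, String key) {
        this.owner = owner;
        this.key = key;
    }

    // Same idea as DFSClient.checkOpen(): every operation fails once closed.
    synchronized void checkOpen() throws IOException {
        if (!open) {
            throw new IOException("client is closed");
        }
    }

    synchronized void close() {
        open = false;
        if (owner != null) {
            owner.evict(key, this); // a later get() then builds a fresh client
        }
    }
}

final class ToyClientCache {
    private final Map<String, ToyClient> cache = new HashMap<>();

    // Stand-in for FileSystem.get(conf): one shared instance per key.
    synchronized ToyClient get(String key) {
        return cache.computeIfAbsent(key, k -> new ToyClient(this, k));
    }

    // Stand-in for FileSystem.getNewInstance(conf): always a private instance.
    synchronized ToyClient getNewInstance(String key) {
        return new ToyClient(null, key);
    }

    // Removes the entry only if it still maps to the client being closed.
    synchronized void evict(String key, ToyClient client) {
        cache.remove(key, client);
    }
}

final class ToyClientCacheDemo {
    public static void main(String[] args) {
        ToyClientCache fs = new ToyClientCache();
        ToyClient shared = fs.get("hdfs://example");
        ToyClient alsoShared = fs.get("hdfs://example");
        shared.close(); // closes it for every holder of the shared instance
        try {
            alsoShared.checkOpen();
        } catch (IOException e) {
            System.out.println("other holder sees: " + e.getMessage());
        }
    }
}
```

In this model the holder of alsoShared never called close(), yet its next operation fails,
which is the situation the extra diagnostics are meant to help track down.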

This whole thing dates to an incompatible change made to cache shared clients. Sharing cached
clients doesn't work reliably across threads. Either the clients are reference counted, or you
call getNewInstance() to get a thread-safe instance. I think I've moved my code to the new API
call, which has been out there for a while now, but this helps track down problems for people
who haven't moved.
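
For the reference-counting alternative, a minimal sketch might look like the following. It is
an assumption about how such a scheme could work, not something in the patch or in Hadoop:
RefCountedClient, acquire() and release() are hypothetical names.

```java
import java.io.IOException;

// Hypothetical sketch of a reference-counted shared client; not part of Hadoop.
final class RefCountedClient {
    private int refs = 0;
    private boolean closed = false;

    // Each thread that wants to use the shared client takes a reference first.
    synchronized RefCountedClient acquire() throws IOException {
        if (closed) {
            throw new IOException("client is closed");
        }
        refs++;
        return this;
    }

    // Called instead of close(); the client really closes on the last release.
    synchronized void release() {
        if (refs == 0) {
            throw new IllegalStateException("release() without matching acquire()");
        }
        refs--;
        if (refs == 0) {
            closed = true;
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}

final class RefCountedClientDemo {
    public static void main(String[] args) throws IOException {
        RefCountedClient client = new RefCountedClient();
        client.acquire();  // thread A's reference
        client.acquire();  // thread B's reference
        client.release();  // A is done; B can still use the client
        System.out.println("closed after first release: " + client.isClosed());
        client.release();  // B is done; now the client closes
        System.out.println("closed after last release: " + client.isClosed());
    }
}
```

The trade-off is that every caller has to pair acquire() with release(), whereas
getNewInstance() leaves each thread owning its own client outright.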

Even if people don't want the extra diagnostics (pity), the test could be used to verify the
semantics of FileSystem.get() and FileSystem.getNewInstance().

> Make it harder to accidentally close a shared DFSClient
> -------------------------------------------------------
>                 Key: HDFS-925
>                 URL: https://issues.apache.org/jira/browse/HDFS-925
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.21.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-5933.patch, HADOOP-5933.patch, HDFS-925.patch, HDFS-925.patch,
>                      HDFS-925.patch, HDFS-925.patch
>
> Every so often I get stack traces telling me that DFSClient is closed, usually in
> {{org.apache.hadoop.hdfs.DFSClient.checkOpen()}}. The root cause of this is usually that one
> thread has closed a shared fsclient while another thread still has a reference to it. If the
> other thread then asks for a new client, it will get one, and the cache is repopulated, but
> if it has one already, then I get to see a stack trace.
> It's effectively a race condition between clients in different threads.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

