hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7718) DFSClient objects created by AbstractFileSystem objects created by FileContext are not closed and results in thread leakage
Date Sat, 07 Feb 2015 02:41:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310483#comment-14310483
] 

Colin Patrick McCabe commented on HDFS-7718:
--------------------------------------------

I apologize for the review delay.  Too many other things to look at.

{code}
 diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/META-INF/services/org.apache.hadoop.crypto.key.KeyProviderFactory
b/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/META-INF/services/org.apache.hadoop.crypto.key.KeyProviderFactory
{code}
This shouldn't be checked in?

{code}
   public KeyProvider getKeyProvider() {
-    return provider;
+    try {
+      return clientContext.getKeyProviderCache().get(conf);
+    } catch (IOException e) {
+      LOG.error("Could not obtain KeyProvider !!", e);
+      return null;
+    }
   }
{code}

Can we just have {{KeyProviderCache#get}} deal with this?  Handling exceptions in two places
seems like one too many.

{code}
+  public KeyProvider get(final Configuration conf) throws IOException {
+    URI kpURI = createKeyProviderURI(conf);
+    if (kpURI == null) {
+      return null;
+    }
+    try {
+      return cache.get(kpURI, new Callable<KeyProvider>() {
+        @Override
+        public KeyProvider call() throws Exception {
+          return DFSUtil.createKeyProvider(conf);
+        }
+      });
+    } catch (Exception e) {
+      LOG.error("Could not create KeyProvider for DFSClient !!", e.getCause());
+      throw new IOException(e);
+    }
+  }
{code}
Let's just have this return null on failure, as discussed above...

{code}
+  private URI createKeyProviderURI(Configuration conf) throws IOException{
+    final String providerUriStr =
+        conf.get(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, null);
+    // No provider set in conf
+    if (providerUriStr == null) {
+      LOG.error("Could not find uri with key ["
+          + DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI
+          + "] to create a keyProvider !!");
+      return null;
+    }
{code}

Rather than looking up {{providerUriStr}} again, let's pass it in from the {{get}} function.
 We already checked in that function that it was non-null, so we don't need to check again.

{code}
+            try {
+              notification.getValue().close();
+            } catch (IOException e) {
+              LOG.error(
+                  "Error closing KeyProvider with uri ["
+                      + notification.getKey() + "]", e);
+              ;
+            }
{code}

Let's catch {{Throwable}} here rather than {{IOException}}, in case our {{KeyProviders}} are
wacky.

+1 pending those fixes.  Thanks, [~asuresh].

> DFSClient objects created by AbstractFileSystem objects created by FileContext are not
closed and results in thread leakage
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7718
>                 URL: https://issues.apache.org/jira/browse/HDFS-7718
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: HDFS-7718.1.patch, HDFS-7718.2.patch
>
>
> Currently, the {{FileContext}} class used by clients such as (for eg. {{YARNRunner}})
creates a new {{AbstractFilesystem}} object on initialization.. which creates a new {{DFSClient}}
object.. which in turn creates a KeyProvider object.. If Encryption is turned on, and https
is turned on, the keyprovider implementation (the {{KMSClientProvider}}) will create a {{ReloadingX509TrustManager}}
thread per instance... which are never killed and can lead to a thread leak



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message