hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mukul Kumar Singh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11887) XceiverClientManager should close XceiverClient on eviction from cache
Date Fri, 02 Jun 2017 17:00:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035039#comment-16035039
] 

Mukul Kumar Singh commented on HDFS-11887:
------------------------------------------

Hi [~cheersyang], thanks for the review.
My apologies that I didn't update the context on this bug. There are two problems in this
bug which I found out while solving the issue mentioned in the bug.

Issues fixed in this patch
===================
1) There is a StackOverFlow which happens when the client is re-enqueued in case there is
reference on the client.
This will happen because there is not enough space in the cache, and adding the *removed*
element back will trigger the evict operation again.
This will cause the stack overflow error. Hence re-adding an element to cache on eviction
is not right as it might trigger a stack overflow.

For an example please look at the following test results for TestBufferManager
https://builds.apache.org/job/PreCommit-HDFS-Build/19742/testReport/org.apache.hadoop.cblock/TestBufferManager/testRepeatedBlockWrites/

2) The second issue is the same as the one mentioned in the title of this bug, which is about
closing the client when there are no references on it.

Details about the patch
==================
With the updated patch on the bug, the idea is to close the client session only when
1) it has been evicted from the cache
2) there are no references on the client.

This will ensure that client is not closed while
a) there are any pending operations on the client (because of reference count)
b) the client has been evicted from cache (This element has been deemed to be not usable in
cache, "evict cache flag")

Having the combination of both these checks will ensure that
a) no clients are leaked
b) no active references on the cache will work on a closed client

Comments
==========
1) as explained earlier, if there are more references on the client, it will be closed when
all the references drop down to 0.
2) I agree to the point, even I wanted to keep the interface same as earlier, but releaseClient
now needs to work on RefCountedXceiverClient, hence the change.
3) The client still works because there are still active references on it, hence it hasn't
been closed. However after releasing the client on line#134, the client is closed and then
the call fails.

In summary, this patch tries to ensure that
1) clients are not leaked
2) there are no errors in the code (stack overflow)
3) clients are kept in cache to ensure that they are not recreated and also so that clients
are reused effectively.

Please let me know if any more details are needed. I should had added them while updating
the bug.





> XceiverClientManager should close XceiverClient on eviction from cache
> ----------------------------------------------------------------------
>
>                 Key: HDFS-11887
>                 URL: https://issues.apache.org/jira/browse/HDFS-11887
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>         Attachments: HDFS-11887-HDFS-7240.001.patch, HDFS-11887-HDFS-7240.002.patch
>
>
> XceiverClientManager doesn't close client on eviction which can leak resources.
> {code}
> public XceiverClientManager(Configuration conf) {
> .
> .
> .
>             public void onRemoval(
>                 RemovalNotification<String, XceiverClientWithAccessInfo>
>                   removalNotification) {
>               // If the reference count is not 0, this xceiver client should not
>               // be evicted, add it back to the cache.
>               WithAccessInfo info = removalNotification.getValue();
>               if (info.hasRefence()) {
>                 synchronized (XceiverClientManager.this.openClient) {
>                   XceiverClientManager.this
>                       .openClient.put(removalNotification.getKey(), info);
>                 }
>               }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message