hadoop-hdfs-issues mailing list archives

From "Chen Liang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12519) Ozone: Add a Lease Manager to SCM
Date Tue, 10 Oct 2017 19:21:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199221#comment-16199221
] 

Chen Liang commented on HDFS-12519:
-----------------------------------

This is actually quite a complex piece of work, and it seems a lot of thought has gone into
it. Thanks [~nandakumar131] for working on this! Some comments:

1. {{LeaseManager}}: how about adding a heap in addition to activeLeases (e.g. a {{PriorityQueue}}
with the timeout as the comparison key)? With this, acquire and release change from O(1) to
O(log n) operations, but {{LeaseMonitor}} no longer needs to walk the entire activeLeases map
on every pass. It only needs to pop the expired leases off the top of the heap and then look
at the new top to determine how long to sleep. That takes the monitor from O(n) per pass to
O(log n) per expired lease, plus O(1) to determine the sleep time. What do you think?
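
A minimal sketch of the heap idea in item 1, not the code from the patch. The class name
{{HeapLeaseSketch}}, the {{String}} resource keys, and the method names are all hypothetical,
and the caller passes in the current time to keep the example deterministic. It also swaps
in lazy deletion for release, because {{java.util.PriorityQueue#remove(Object)}} is linear
time rather than O(log n).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical sketch: active leases tracked in a map plus a min-heap keyed on expiry. */
final class HeapLeaseSketch {
  private static final class Entry {
    final String resource;
    final long expiryMillis;

    Entry(String resource, long expiryMillis) {
      this.resource = resource;
      this.expiryMillis = expiryMillis;
    }
  }

  private final Map<String, Entry> activeLeases = new HashMap<>();
  // Min-heap: the lease that expires soonest is always at the top.
  private final PriorityQueue<Entry> byExpiry =
      new PriorityQueue<>((a, b) -> Long.compare(a.expiryMillis, b.expiryMillis));

  /** O(log n): record the lease in the map and push it onto the heap. */
  synchronized void acquire(String resource, long nowMillis, long timeoutMillis) {
    Entry e = new Entry(resource, nowMillis + timeoutMillis);
    activeLeases.put(resource, e);
    byExpiry.add(e);
  }

  /** O(1): drop the lease from the map only; its heap entry is skipped later. */
  synchronized void release(String resource) {
    activeLeases.remove(resource);
  }

  /** Pops every entry whose expiry has passed and returns the ones still active. */
  synchronized List<String> expire(long nowMillis) {
    List<String> expired = new ArrayList<>();
    while (!byExpiry.isEmpty() && byExpiry.peek().expiryMillis <= nowMillis) {
      Entry e = byExpiry.poll();
      // Identity check filters out released or re-acquired (stale) entries.
      if (activeLeases.get(e.resource) == e) {
        activeLeases.remove(e.resource);
        expired.add(e.resource);
      }
    }
    return expired;
  }

  /** Milliseconds until the next active lease expires, or -1 if there is none. */
  synchronized long millisUntilNextExpiry(long nowMillis) {
    // Discard stale entries sitting at the top of the heap.
    while (!byExpiry.isEmpty()
        && activeLeases.get(byExpiry.peek().resource) != byExpiry.peek()) {
      byExpiry.poll();
    }
    if (byExpiry.isEmpty()) {
      return -1L;
    }
    return Math.max(0L, byExpiry.peek().expiryMillis - nowMillis);
  }
}
```

In this sketch a monitor thread would call {{expire(now)}} and then sleep for
{{millisUntilNextExpiry(now)}}. The trade-off of lazy deletion is that released leases stay
in the heap until they reach the top.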

2. {{LeaseManager#LeaseMonitor#run}}: about the interrupt in acquire, can the following case
happen?
a. The lease monitor thread wakes up and is doing the timeout check (the for loop).
b. At the same time another acquire call comes in, which calls leaseMonitorThread.interrupt().
If this can happen, will the lease monitor thread get stopped? It seems the catch for InterruptedException
only covers an interrupt during sleep, not an interrupt during the lease-checking loop.
Also, if {{acquire}} can interrupt leaseMonitor while it is doing the check, then whenever
leaseMonitor is checking timeouts an acquire call may come in and interrupt it, in which case
LeaseMonitor may never be able to walk through all leases and remove all the expired ones.
But since this seems to be an extreme corner case that only happens with a large number of
lease acquire calls, I'm fine with not handling it for now...
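
To make the scenario in item 2 concrete, here is a small standalone sketch, not the patch's
{{LeaseMonitor}}. The class and method names are hypothetical, and the thread interrupts itself
mid-scan as a stand-in for a concurrent acquire call. It assumes the check loop makes no
blocking or interruptible calls, which may not hold for the real code. Under that assumption,
{{Thread#interrupt}} only sets the interrupt flag, so the scan runs to completion and the
following {{Thread.sleep}} throws InterruptedException immediately.

```java
/** Hypothetical sketch: what an interrupt during a non-blocking scan does in Java. */
final class InterruptDuringScanSketch {

  /** Returns {items scanned, 1 if the sleep after the scan was interrupted else 0}. */
  static int[] scanThenSleep(int items, long sleepMillis) {
    int scanned = 0;
    for (int i = 0; i < items; i++) {
      if (i == items / 2) {
        // Stand-in for a concurrent acquire() calling leaseMonitorThread.interrupt().
        Thread.currentThread().interrupt();
      }
      scanned++; // Plain computation: the pending interrupt does not abort the loop.
    }

    boolean sleepInterrupted = false;
    try {
      // The interrupt flag is still set, so sleep throws right away
      // and clears the flag.
      Thread.sleep(sleepMillis);
    } catch (InterruptedException e) {
      sleepInterrupted = true;
    }
    return new int[] {scanned, sleepInterrupted ? 1 : 0};
  }
}
```

If this assumption holds for the patch, the effect of an interrupt arriving during the check
would be that the monitor skips its next sleep and rechecks right away, rather than stopping.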
 

> Ozone: Add a Lease Manager to SCM
> ---------------------------------
>
>                 Key: HDFS-12519
>                 URL: https://issues.apache.org/jira/browse/HDFS-12519
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Anu Engineer
>            Assignee: Nandakumar
>              Labels: ozoneMerge
>         Attachments: HDFS-12519-HDFS-7240.000.patch, HDFS-12519-HDFS-7240.001.patch,
HDFS-12519-HDFS-7240.002.patch
>
>
> Many objects, including containers and pipelines, can time out during the creation process.
We need a way to track these timeouts. This lease manager allows SCM to hold a lease on these
objects and helps SCM time out when waiting for the creation of these objects.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

