hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1142) Lease recovery doesn't reassign lease when triggered by append()
Date Sat, 15 May 2010 00:01:44 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867747#action_12867747 ]

Konstantin Shvachko commented on HDFS-1142:
-------------------------------------------

Sorry, took me a while.
The idea behind lease recovery after soft limit expiration was that it is performed under the
same lease holder. Here is why.
Expiration of the soft limit means that somebody else can claim the lease. If that client succeeds,
it becomes the new owner; if not, the ownership does not change.
So several clients may compete for the same lease here. They will call {{create()}} and get
{{RecoveryInProgressException}} in response, which indicates that they should retry. The old
client, if it is still alive, can also compete for the lease. It has an advantage over the other
clients because it does not need to go through the recovery process, but that seems fair.
If you reassign the lease to {{HDFS_NameNode}}, then its timeouts will reset, see {{reassignLease()}},
and this will change the behavior. The clients trying to claim the file will get {{AlreadyBeingCreatedException}},
which means they cannot compete for the file anymore and should fail.
Suppose there is only one new client, and the old owner has already died. The client calls
{{create()}}. This triggers lease recovery on the NN, which starts the recovery under {{HDFS_NameNode}}
and throws {{RecoveryInProgressException}} back to the client. The client retries as expected,
and the next time it gets {{AlreadyBeingCreatedException}}. Thinking that somebody else got lucky
before it, the client bails out, which is not right because there is nobody else competing for
the file.
Does that make sense? I don't see a problem here. Do you have failing tests because of that?
That, by the way, explains the parameter of {{internalReleaseLease()}}.
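
To make the single-client scenario above concrete, here is a minimal toy model of the two policies.
It is only a sketch under stated assumptions, not {{FSNamesystem}} code: {{ToyNameNode}}, {{Outcome}},
{{runClient()}} and {{finishRecovery()}} are hypothetical names, the exceptions are modelled as enum
values, and recovery is assumed to finish right after the client's second attempt.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are real HDFS classes.
class LeaseRecoverySketch {
    // Exceptions from the discussion, modelled as plain return values.
    enum Outcome { RECOVERY_IN_PROGRESS, ALREADY_BEING_CREATED, GRANTED }

    static final String NN_HOLDER = "HDFS_NameNode";

    /** Stand-in for the NameNode's lease state for a single file. */
    static class ToyNameNode {
        final boolean reassignOnRecovery;
        String holder;
        boolean softLimitExpired = true;  // the old writer has gone quiet
        boolean recoveryDone = false;

        ToyNameNode(String holder, boolean reassignOnRecovery) {
            this.holder = holder;
            this.reassignOnRecovery = reassignOnRecovery;
        }

        Outcome create(String client) {
            if (recoveryDone) {            // file is closed, lease is free
                holder = client;
                return Outcome.GRANTED;
            }
            if (!softLimitExpired) {       // current holder looks alive
                return Outcome.ALREADY_BEING_CREATED;
            }
            if (reassignOnRecovery) {      // the policy under discussion
                holder = NN_HOLDER;
                softLimitExpired = false;  // reassigning resets the timeouts
            }
            return Outcome.RECOVERY_IN_PROGRESS;
        }

        void finishRecovery() { recoveryDone = true; }
    }

    /** A client that retries on RECOVERY_IN_PROGRESS and stops on anything else. */
    static List<Outcome> runClient(ToyNameNode nn, String client,
                                   int maxAttempts, int recoveryFinishesAfter) {
        List<Outcome> seen = new ArrayList<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Outcome o = nn.create(client);
            seen.add(o);
            if (o != Outcome.RECOVERY_IN_PROGRESS) {
                break;                     // either granted or bailing out
            }
            if (attempt == recoveryFinishesAfter) {
                nn.finishRecovery();       // assumed recovery duration
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("same holder: "
            + runClient(new ToyNameNode("oldClient", false), "newClient", 5, 2));
        System.out.println("reassigned:  "
            + runClient(new ToyNameNode("oldClient", true), "newClient", 5, 2));
    }
}
```

In this model the client that retries under the same holder is eventually granted the lease, while
with reassignment it sees the {{AlreadyBeingCreatedException}} stand-in on its second attempt and
gives up even though nobody else is competing.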

- Introduction of {{NN_LEASE_RECOVERY_HOLDER}} constant definitely makes sense.
- Persisting leases is not an issue if we do not reassign.
- For future reference, it is very undesirable to declare public methods in {{FSNamesystem}}
just to provide access to them from tests. The tests should either be in the right package, or
the {{FSNamesystem}} methods should be accessed via {{NameNodeAdapter}}. That is why it was
introduced in the first place, see HDFS-563.
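
The adapter approach from the last point can be sketched roughly as follows. This is an illustration
of the pattern only: {{ToyNamesystem}}, {{ToyNameNodeAdapter}}, {{countLeases()}} and {{getLeaseCount()}}
are hypothetical names and do not reflect the actual {{FSNamesystem}} or {{NameNodeAdapter}} signatures.

```java
// Production-side class: the method stays package-private instead of public.
class ToyNamesystem {
    private final int leaseCount = 3;  // arbitrary value for the illustration

    // Package-private: callable only from classes in the same package.
    int countLeases() {
        return leaseCount;
    }
}

// Test-support class placed in the same package as ToyNamesystem.
// In a real source tree it would be public and live under the test sources,
// so tests in other packages go through it rather than widening the
// visibility of the production method.
final class ToyNameNodeAdapter {
    private ToyNameNodeAdapter() {}    // static helpers only

    static int getLeaseCount(ToyNamesystem ns) {
        return ns.countLeases();
    }
}
```

A test would then call {{ToyNameNodeAdapter.getLeaseCount(ns)}} and the production class keeps its
narrower visibility.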


> Lease recovery doesn't reassign lease when triggered by append()
> ----------------------------------------------------------------
>
>                 Key: HDFS-1142
>                 URL: https://issues.apache.org/jira/browse/HDFS-1142
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-1142.txt, hdfs-1142.txt
>
>
> If a soft lease has expired and another writer calls append(), it triggers lease recovery
> but doesn't reassign the lease to a new owner. Therefore, the old writer can continue to allocate
> new blocks, try to steal back the lease, etc. This is for the testRecoveryOnBlockBoundary
> case of HDFS-1139

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

