hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6765) Log INodeID instead of full path in edit log when closing a file
Date Tue, 29 Jul 2014 19:02:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078172#comment-14078172
] 

Colin Patrick McCabe commented on HDFS-6765:
--------------------------------------------

bq. I'm a bit leery due to recent edit log corruptions. Replaying the edit based on inode
instead of path isn't replaying the edit as originally issued which may mask edit bugs.

I hope I'm not misinterpreting what you're asking here, but the RPC "originally issued" by
the DFSClient contains an inode ID, which we use in preference to the path if it's present.

You guys already know this, but just to be 100% clear, the problem with closing a file by
the path is that it doesn't work in the presence of renames.  The old way of working around
this problem was for the DFSClient to maintain a map of open files, and whenever it did a
rename, it would modify that map.  Of course, this fails horribly if some __other__ client
renamed the file or ancestor.

So the new way of doing things (introduced in HDFS-6294) is to basically ignore that path
provided in the completeFile RPC (we would have removed it from the RPC, but we didn't want
to break wire compatibility).  I think we use that path in log messages in some cases (like
if the inode isn't found), but that's about it.  For everything important, we use the inode
ID since that always works, even in the presence of renames.  The paths we use come from that
inode ID, not from the path in the completeFile RPC, which we assume is a lie.

So in some sense, translating that inode ID provided in completeFile back to a path for the
edit log is an extra step.

bq. Other than the inconsistency with other edit ops, is this going to break compatibility
for the edits viewer?

So, the compatibility guarantee with edit logs is that a new client can read old edits, but
not vice versa.  There is nothing like PB's "optional" fields in the current edit log format...
any new stuff whatsoever causes a hard failure in old clients.  This is enforced by code bailing
out if the edit log version is newer than what it knows it can support.  It won't even try.

This isn't really a problem because anyone using oev is an expert doing some kind of advanced
debugging.  And the error message is pretty clear... get yourself a version of oev that matches
the edit log before you attempt the debugging.

We could handle this JIRA by looking at the edit log version number (provided in all edit
logs), and either looking for a string or an inode in OP_CLOSE depending on that version.
 Alternately, we could create a new edit log op for close-with-inode-id.  I kind of like the
"version number" strategy... that's what the version number is for, after all, and we have
too many legacy edit types anyway...

bq. I'll look at HDFS-6757 but could you elaborate on what impact this has on performance
(other than re-building the path?) and what cleanup it blocks?

Putting full paths into the edit log is kind of unfortunate, since they may be very long (especially
if people have long pathnames).  An inode ID is much smaller.

The biggest reason why I think this is a good change isn't performance at all, though, but
correctness and simplicity.  If you have to repair a damaged edit log, and you miss a rename,
you'll be a lot happier if the subsequent close operations are done by inode ID than if they're
done by path.  Our inode IDs are globally unique, which makes things nice and simple.

On the other hand, handling paths can get hairy really fast.  We have places in FSNamesystem
comparing paths with String#equals, and relying on the fact that all paths are properly canonicalized.
 That's brittle.  And of course, any time you're dealing with paths, you're also dealing with
namesystem lookups, which will generate garbage and burn CPU.

> Log INodeID instead of full path in edit log when closing a file
> ----------------------------------------------------------------
>
>                 Key: HDFS-6765
>                 URL: https://issues.apache.org/jira/browse/HDFS-6765
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haohui Mai
>
> When closing a file, currently the edit log records the full path instead of the inode
id. The performance is sub-optimal, and it prevents further clean ups of the code (e.g., HDFS-6757)
> This jira proposes to record inode id instead of the full path in edit logs when closing
a file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message