hadoop-common-user mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent
Date Sun, 04 Nov 2012 19:34:56 GMT
On 4 November 2012 17:25, lei liu <liulei412@gmail.com> wrote:

> I want to know what applications are idempotent or not idempotent? and
> Why? Could you give me a example.

When you say "idempotent", I presume you mean the operation happens
"at-most-once"; ignoring the degenerate case where all requests are

You can take operations that fail if their conditions aren't met, with delete
(path named "something") being the simplest. The operation can send an error
back, "file not found", but the client library can then downgrade that to an
idempotent assertion: "when the acknowledgment was sent from the namenode,
there was nothing at the end of this path". Which will hold on a replay,
though if someone creates a file in between, that replay could be
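
A minimal sketch of that client-side downgrade, in Python rather than the real
HDFS client. ToyNamenode, FileNotFound and delete_idempotent are hypothetical
names for illustration only; the "namenode" here is just an in-memory set of
paths.

```python
class FileNotFound(Exception):
    """Raised by the toy namenode when the path does not exist."""


class ToyNamenode:
    """Hypothetical in-memory stand-in for a namenode: a set of paths."""

    def __init__(self, paths=()):
        self.paths = set(paths)

    def delete(self, path):
        # Strict server-side semantics: fail if the precondition is not met.
        if path not in self.paths:
            raise FileNotFound(path)
        self.paths.remove(path)


def delete_idempotent(namenode, path):
    """Client-side wrapper that turns the error into an assertion on state.

    Returns True if this call removed the path, False if nothing was there.
    Either way, nothing was at the end of the path when the namenode replied.
    """
    try:
        namenode.delete(path)
        return True
    except FileNotFound:
        return False


nn = ToyNamenode({"/data/something"})
first = delete_idempotent(nn, "/data/something")   # original request
replay = delete_idempotent(nn, "/data/something")  # retried request
```

Both calls leave the path absent; only the return value tells the caller
whether this particular request did the removing.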

Now what about move(src,dest)?

If it succeeds, then there is no src path, as it is now at "dest".

What happens if you call it a second time? There is no src, only dest. You
can't report that back as a success as it is clearly a failure: no src, no
dest. It's hard to convert that into an assertion on the observable state
of the system as the state doesn't reflect the history, so you need some
temporal logic in there too: at time t0 there existed a directory src, at
time t1 the directory src no longer existed and its contents were now found
under directory "dest".

And again, what happens if, worse, someone else did something in between and
created a src directory (which they could do, given that the first one has
been renamed dest)? The operation replays and the move takes place twice
-you've just crossed into at-least-once operations, which is not what you want.
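
Both replay cases can be seen in a toy model. This is a sketch in Python, not
HDFS code: ToyNamenode is a hypothetical in-memory map from path to contents,
and it assumes a rename simply replaces an existing dest, which is a
simplification of what a real filesystem would do.

```python
class ToyNamenode:
    """Hypothetical in-memory stand-in: maps a path to its contents."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def create(self, path, contents):
        self.entries[path] = contents

    def rename(self, src, dest):
        if src not in self.entries:
            raise FileNotFoundError(src)
        # Toy assumption: an existing dest is simply replaced.
        self.entries[dest] = self.entries.pop(src)


# Case 1: a plain replay. The second call finds no src and fails,
# even though the state is exactly what the caller asked for.
nn = ToyNamenode({"/src": "client A data"})
nn.rename("/src", "/dest")
try:
    nn.rename("/src", "/dest")
    replay_result = "moved"
except FileNotFoundError:
    replay_result = "failed"

# Case 2: another client recreates /src before the replay arrives.
# The replay now succeeds and the move happens a second time.
nn2 = ToyNamenode({"/src": "client A data"})
nn2.rename("/src", "/dest")
nn2.create("/src", "client B data")
nn2.rename("/src", "/dest")  # replayed request from client A
```

In the second case client A's data at /dest has been replaced by client B's,
and neither client saw an error.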

At this point I'm sure you are thinking of having some kind of transaction
journal, recording that at time Tn, transaction Xn moved the dir. Which
means you have to start to collect a transaction log of what happened. Now
effectively HDFS is a journalled file system, it does record a lot of
things. It just doesn't record user transactions with it, or rescan the log
whenever any operation comes in, so as to decide what to ignore.

Or you just skip the filesystem changes and have some data structure
recording "recent" transaction IDs; ignore repeated requests with the same
IDs. Better, though you'd need to make that failure resistant -its state
must propagate to the journal and any failover namenodes so that a
transaction replay will be idempotent even if the filesystem fails over
between the original and replayed transaction. And of course all of this
needs to be atomic with the filesystem state changes...
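
A rough sketch of the transaction-ID idea, again as a toy in Python.
DedupingNamenode and its txn_id parameter are hypothetical and not part of any
HDFS API. The sketch keeps every ID forever in one process; it deliberately
leaves out the hard parts named above: expiring "recent" IDs, persisting them
to the journal, propagating them to failover namenodes, and making the record
atomic with the filesystem change.

```python
class DedupingNamenode:
    """Hypothetical toy: remembers the outcome of each transaction ID."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})
        self.completed = {}  # txn_id -> outcome recorded on first execution

    def create(self, path, contents):
        self.entries[path] = contents

    def rename(self, txn_id, src, dest):
        # A repeated ID is not executed again; the first outcome is returned.
        if txn_id in self.completed:
            return self.completed[txn_id]
        if src not in self.entries:
            outcome = "failed: no such path " + src
        else:
            self.entries[dest] = self.entries.pop(src)
            outcome = "ok"
        self.completed[txn_id] = outcome
        return outcome


nn = DedupingNamenode({"/src": "client A data"})
first = nn.rename("txn-1", "/src", "/dest")   # original request
nn.create("/src", "client B data")            # someone else recreates /src
replay = nn.rename("txn-1", "/src", "/dest")  # replay is ignored
```

The replay reports the original success and leaves client B's new /src alone,
but only for as long as the completed map survives.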

Summary: It gets complicated fast. Throwing errors back to the caller makes
life a lot simpler and lets the caller choose its own outcome -even though
that's not always satisfactory.

Alternatively: it's not that people don't want globally distributed
transactions -it's just hard.


> 2012/10/29 Ted Dunning <tdunning@maprtech.com>
>> Create cannot be idempotent because of the problem of watches and
>> sequential files.
>> Similarly, mkdirs, rename and delete cannot generally be idempotent.  In
>> particular applications, you might find it is OK to treat them as such, but
>> there are definitely applications where they are not idempotent.
>> On Sun, Oct 28, 2012 at 2:40 AM, lei liu <liulei412@gmail.com> wrote:
>>> I think these methods should are idempotent, these methods should be repeated
>>> calls to be harmless by same client.
>>> Thanks,
>>> LiuLei
