hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12910) Add new FileSystem API to support asynchronous method calls
Date Thu, 10 Mar 2016 11:29:40 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189164#comment-15189164
] 

Steve Loughran commented on HADOOP-12910:
-----------------------------------------

Actually, one more thing to define in HDFS-9924 and include in any specification: linearizability/serializability
guarantees.

Specifically, will this API make the following guarantees?

h3. Serializability/Linearizability/Atomicity
# The atomicity requirements/guarantees of the {{FileSystem}} API are unchanged (and those
blobstores which break them are still broken).

# A series of operations, issued from a single thread against the same instance of {{FutureFileSystem}},
will always be executed in the order in which they are submitted.

# If at time {{t}}, thread A issues a request, then in the same process, at time {{t1 >
t}}, thread B issues a filesystem request *against the same instance of FutureFileSystem*,
then the request by thread A will be executed before the request in thread B. 
That is: requests are never reordered within a single FS instance; they are executed in the order
of submission, irrespective of which thread makes the submission.

# If at time {{t}}, thread A issues a request, then in the same process, at time {{t1 >
t}}, thread B issues a filesystem request *against a different instance of FutureFileSystem*,
then there are no guarantees of the order of execution. Different queue: different outcome.
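
To make the per-instance ordering guarantee concrete, here is a minimal sketch of one way an instance could behave: every request goes onto a single FIFO queue drained by one worker. {{SingleQueueFs}}, {{submit}}, {{drain}} and {{demo}} are hypothetical names for illustration only, not the proposed {{FutureFileSystem}} API, and the single-threaded executor is an assumption about how ordering might be enforced.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical model, not the proposed API: one FIFO queue per instance.
class SingleQueueFs {
  // A single worker thread executes tasks in submission order.
  private final ExecutorService queue = Executors.newSingleThreadExecutor();
  // Log of operations in the order they were actually executed.
  private final List<String> executed =
      Collections.synchronizedList(new ArrayList<String>());

  // The operation is recorded when it executes, not when it is submitted.
  Future<Boolean> submit(final String op) {
    Callable<Boolean> task = () -> executed.add(op);
    return queue.submit(task);
  }

  // Stops accepting work, waits for queued work, returns the execution log.
  List<String> drain() {
    queue.shutdown();
    try {
      queue.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return new ArrayList<String>(executed);
  }

  // The calling thread submits two requests, then a second thread submits one.
  static List<String> demo() {
    final SingleQueueFs fs = new SingleQueueFs();
    fs.submit("delete /c");
    fs.submit("rename /a /c");
    // Thread.start() orders the second thread's submission after the two above.
    Thread b = new Thread(() -> fs.submit("rename /b /a"));
    b.start();
    try {
      b.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return fs.drain();
  }
}
```

With two separate {{SingleQueueFs}} instances there would be two independent queues, so nothing in this sketch would order a request on one against a request on the other.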

There's also the ordering across processes and systems. Here you'd need to say something like
"they are processed in the strict order the NN receives them". They may be interleaved, but
the actions of each {{FutureFileSystem}} instance are executed in a linear order.

Also: parameter/state validation. Basic parameter validity may be checked in the initial call
(null values, illegal values), but all request validation operations which examine the observable
state of the FS will not take place until the future is actually executed. Thus, the state
of the filesystem may change between the call being made and it being executed.
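
A rough sketch of that split, under stated assumptions: argument checks throw in the caller's thread, while checks against filesystem state run only inside the queued task. {{DeferredValidationFs}} and its methods are hypothetical, the "filesystem" is just a set of path strings, and the rename rule (source must exist, destination must not) is a deliberate simplification rather than the real {{FileSystem.rename}} contract.

```java
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model: eager parameter checks, deferred state checks.
class DeferredValidationFs {
  private final Set<String> paths = ConcurrentHashMap.newKeySet();
  private final ExecutorService queue = Executors.newSingleThreadExecutor();

  Future<Boolean> create(final String path) {
    requireValid(path);                 // checked at call time
    Callable<Boolean> task = () -> paths.add(path);
    return queue.submit(task);
  }

  Future<Boolean> rename(final String src, final String dst) {
    requireValid(src);                  // checked at call time
    requireValid(dst);
    Callable<Boolean> task = () -> {
      // State is examined only when the task runs, so it reflects
      // every request that was queued ahead of this one.
      if (!paths.contains(src) || paths.contains(dst)) {
        return false;
      }
      paths.remove(src);
      return paths.add(dst);
    };
    return queue.submit(task);
  }

  private static void requireValid(String path) {
    if (path == null || !path.startsWith("/")) {
      throw new IllegalArgumentException("illegal path: " + path);
    }
  }

  // Convenience wrapper that turns checked exceptions into unchecked ones.
  static boolean await(Future<Boolean> f) {
    try {
      return f.get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  void close() {
    queue.shutdown();
  }
}
```

In this model, submitting {{rename("/a", "/c")}} followed by {{rename("/a", "/b")}} lets both calls pass the call-time checks, but the second future resolves to false because {{/a}} is gone by the time it executes.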

If you don't spell this out, then the semantics of 
{code}
delete("/c");
rename("/a","/c");
rename("/b","/a");
{code}
are undefined (assuming {{/a}} and {{/b}} refer to paths for which {{exists()}} holds at the
time of the call). The cross-thread serialization guarantee is needed to guarantee that any
two threads, synchronized by any means, will have the ordering of their requests executed
according to the in-process {{happens-before}} guarantees of the synchronization mechanism.
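
To illustrate why the outcome depends on execution order, the following sketch applies the three operations above to a toy in-memory model, once in submission order and once reordered. {{OrderMatters}} is hypothetical; the model maps each path to a content label, assumes {{/c}} exists initially, and assumes a rename onto an existing path fails, which is a simplification of real filesystem behaviour.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model: path -> content label, no real filesystem involved.
class OrderMatters {
  static boolean delete(Map<String, String> fs, String path) {
    return fs.remove(path) != null;
  }

  // Simplifying assumption: rename fails if src is missing or dst exists.
  static boolean rename(Map<String, String> fs, String src, String dst) {
    if (!fs.containsKey(src) || fs.containsKey(dst)) {
      return false;
    }
    fs.put(dst, fs.remove(src));
    return true;
  }

  static Map<String, String> initial() {
    Map<String, String> fs = new TreeMap<String, String>();
    fs.put("/a", "A");
    fs.put("/b", "B");
    fs.put("/c", "C");
    return fs;
  }

  // Operations executed in the order they were submitted.
  static Map<String, String> inOrder() {
    Map<String, String> fs = initial();
    delete(fs, "/c");
    rename(fs, "/a", "/c");
    rename(fs, "/b", "/a");
    return fs;
  }

  // Same operations, but the last rename is executed first.
  static Map<String, String> reordered() {
    Map<String, String> fs = initial();
    rename(fs, "/b", "/a"); // fails in this model: /a still exists
    delete(fs, "/c");
    rename(fs, "/a", "/c");
    return fs;
  }
}
```

In submission order the model ends with the old {{/b}} at {{/a}} and the old {{/a}} at {{/c}}; in the reordered run {{/b}} never moves. Without an ordering guarantee a caller cannot tell which of these states it will get.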

> Add new FileSystem API to support asynchronous method calls
> -----------------------------------------------------------
>
>                 Key: HADOOP-12910
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12910
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>
> Add a new API, namely FutureFileSystem (or AsynchronousFileSystem, if it is a better name).  All the APIs in FutureFileSystem are the same as FileSystem except that the return type is wrapped by Future, e.g.
> {code}
>   //FileSystem
>   public boolean rename(Path src, Path dst) throws IOException;
>   //FutureFileSystem
>   public Future<Boolean> rename(Path src, Path dst) throws IOException;
> {code}
> Note that FutureFileSystem does not extend FileSystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
