hadoop-hdfs-dev mailing list archives

From 张铎(Duo Zhang) <palomino...@gmail.com>
Subject Re: [DISCUSSION] Create a branch to work on non-blocking access to HDFS
Date Fri, 04 May 2018 00:25:16 GMT
Will prepare a design doc soon to roughly describe the things we want to do
and how we plan to do it, and also the undecided things, such as how to
support fan-out.

Thanks.

2018-05-04 4:54 GMT+08:00 Anu Engineer <aengineer@hortonworks.com>:

> Hi St.ack/Wei-Chiu,
>
> It is very kind of St.Ack to bring this question to HDFS Dev. I think this
> is a good feature to have. As for the branch question,
> HDFS-9924 branch is already open, we could just use that and I am +1 on
> adding Duo as a branch committer.
>
> I am not familiar with the HBase code base, so I presume there will be
> some deviation from the current design doc posted in HDFS-9924. Would it
> make sense to post a new design proposal on HDFS-9924?
>
> --Anu
>
>
>
> On 5/3/18, 9:29 AM, "Wei-Chiu Chuang" <weichiu@apache.org> wrote:
>
>     Given that HBase 2 uses async output by default, the way that code is
>     maintained today in HBase is not sustainable. That piece of code
>     should be maintained in HDFS. I am +1 as a participant in both
>     communities.
>
>     On Thu, May 3, 2018 at 9:14 AM, Stack <stack@duboce.net> wrote:
>
>     > Ok with you lot if a few of us open a branch to work on a
>     > non-blocking HDFS client?
>     >
>     > Intent is to finish up the old issue "HDFS-9924 [umbrella]
>     > Nonblocking HDFS Access". At the foot of this umbrella JIRA is a
>     > proposal by the heavy-lifter, Duo Zhang. Over in HBase, we have a
>     > limited async DFS client (written by Duo) that we use for writing
>     > Write-Ahead Logs. We call it AsyncFSWAL. It was shipped as the
>     > default WAL writer in hbase-2.0.0.
>     >
>     > Let me quote Duo from his proposal at the base of HDFS-9924:
>     >
>     > ....We use lots of internal APIs of HDFS to implement the
>     > AsyncFSWAL, so it is expected that things like HBASE-20244
>     > <https://issues.apache.org/jira/browse/HBASE-20244>
>     > ["NoSuchMethodException when retrieving private method
>     > decryptEncryptedDataEncryptionKey from DFSClient"] will happen
>     > again and again.
>     >
>     > To make life easier, we need to move the async output related code
>     > into HDFS. The POC [attached as patch on HDFS-9924] shows that
>     > option 3 [1] can work, so I would like to create a feature branch
>     > to implement the async dfs client. In general I think there are 4
>     > steps:
>     >
>     > 1. Implement an async rpc client with option 3 [1] described above.
>     > 2. Implement the filesystem APIs which only need to connect to NN,
>     > such as 'mkdirs'.
>     > 3. Implement async file read. The problem is the API. For pread I
>     > think a CompletableFuture is enough, the problem is for the
>     > streaming read. Need to discuss later.
>     > 4. Implement async file write. The API will also be a problem, but
>     > a more important problem is that, if we want to support fan-out,
>     > the current logic at DN side will make the semantic broken as we
>     > can read uncommitted data very easily. In HBase it is solved by
>     > HBASE-14004 <https://issues.apache.org/jira/browse/HBASE-14004>
>     > but I do not think we should keep the broken behavior in HDFS. We
>     > need to find a way to deal with it.
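
To make step 3 above concrete, here is a rough sketch of what a
CompletableFuture-based pread could look like. Nothing in it is HDFS code:
the AsyncPositionalReadable interface and the LocalFileReader class are
hypothetical names made up for illustration, and a local file read through
the JDK's AsynchronousFileChannel stands in for a DataNode read. It only
shows the shape of the API, not how the branch will implement it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

public class AsyncPreadSketch {

    /** Hypothetical interface for illustration; not an existing HDFS API. */
    interface AsyncPositionalReadable {
        /** Reads into dst starting at position; completes with the byte count, or -1 at EOF. */
        CompletableFuture<Integer> pread(long position, ByteBuffer dst);
    }

    /** Stand-in implementation over a local file (assumption: no HDFS involved). */
    static final class LocalFileReader implements AsyncPositionalReadable, AutoCloseable {
        private final AsynchronousFileChannel channel;

        LocalFileReader(Path path) throws IOException {
            this.channel = AsynchronousFileChannel.open(path, StandardOpenOption.READ);
        }

        @Override
        public CompletableFuture<Integer> pread(long position, ByteBuffer dst) {
            CompletableFuture<Integer> future = new CompletableFuture<>();
            // Bridge the callback-style JDK API to a CompletableFuture.
            channel.read(dst, position, null, new CompletionHandler<Integer, Void>() {
                @Override
                public void completed(Integer bytesRead, Void attachment) {
                    future.complete(bytesRead);
                }

                @Override
                public void failed(Throwable cause, Void attachment) {
                    future.completeExceptionally(cause);
                }
            });
            return future;
        }

        @Override
        public void close() throws IOException {
            channel.close();
        }
    }

    /** Reads up to length bytes at position and decodes them as UTF-8. */
    static String readAt(Path path, long position, int length) throws Exception {
        try (LocalFileReader reader = new LocalFileReader(path)) {
            ByteBuffer buf = ByteBuffer.allocate(length);
            return reader.pread(position, buf)
                .thenApply(n -> {
                    buf.flip(); // switch the buffer from writing to reading
                    return StandardCharsets.UTF_8.decode(buf).toString();
                })
                .get(); // block only here, for the demo
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("pread", ".txt");
        Files.write(tmp, "hello async hdfs".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAt(tmp, 6, 5));
        Files.delete(tmp);
    }
}
```

A single positioned read maps cleanly onto one future because it has one
result. A streaming read has no such single completion point, which is
presumably why that API is left open in step 3.
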
>     >
>     > Comments welcome.
>     >
>     > Intent is to make a branch named HDFS-9924 (or should we just do a
>     > new JIRA?) and to add Duo as a feature branch committer. If all
>     > goes well, we'll call for a merge VOTE.
>     >
>     > Thanks,
>     > St.Ack
>     >
>     > 1. Option 3: "Use the old protobuf rpc interface and implement a
>     > new rpc framework. The benefit is that we also do not need port
>     > unification service at server side and do not need to maintain two
>     > implementations at server side. And one more thing is that we do
>     > not need to upgrade protobuf to 3.x."
>     >
>
>
>
>     --
>     A very happy Hadoop contributor
>
>
>
