[ https://issues.apache.org/jira/browse/HDFS-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HDFS-895:
-----------------------------
Attachment: hdfs-895-review.txt
Took some time to update this to the newest trunk, based on the fixes in the 20-append patch.
The attachment hdfs-895-review.txt shows the patch broken up into three separate commits:
first two refactors, then the actual parallel sync feature. The patch should be easier to
understand and review this way. Will upload the total patch as well.
> Allow hflush/sync to occur in parallel with new writes to the file
> ------------------------------------------------------------------
>
> Key: HDFS-895
> URL: https://issues.apache.org/jira/browse/HDFS-895
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs client
> Affects Versions: 0.22.0
> Reporter: dhruba borthakur
> Assignee: Todd Lipcon
> Fix For: 0.22.0
>
> Attachments: hdfs-895-0.20-append.txt, hdfs-895-20.txt, hdfs-895-review.txt, hdfs-895-trunk.txt, hdfs-895.txt
>
>
> In the current trunk, the HDFS client methods writeChunk() and hflush/sync are synchronized. This means that if an hflush/sync is in progress, an application cannot write data to the HDFS client buffer. This reduces the write throughput of the transaction log in HBase.
> The hflush/sync should allow new writes to happen to the HDFS client even when an hflush/sync is in progress. It can record the seqno of the message for which it should receive the ack, indicate to the DataStream thread to start flushing those messages, exit the synchronized section, and just wait for that ack to arrive.
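
The approach described above can be illustrated with a minimal Java sketch. This is not the code from the attached patches: the class ParallelFlushSketch and all of its method and field names are hypothetical, and the streamer thread here only simulates a pipeline round trip with a short sleep. It shows hflush() recording the last queued seqno under the lock, signalling the streamer, and then waiting for the ack in a way that lets write() keep queueing packets.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Illustrative only; all names are hypothetical and this is not the DFSClient code. */
final class ParallelFlushSketch {
    private final Object lock = new Object();
    private final Queue<Long> dataQueue = new ArrayDeque<>(); // packets waiting to be sent
    private long lastQueuedSeqno = -1;
    private long lastAckedSeqno = -1;

    /** Starts a daemon thread standing in for the streamer and the ack path. */
    void start() {
        Thread t = new Thread(this::streamAndAck, "streamer-sketch");
        t.setDaemon(true);
        t.start();
    }

    /** Queues one packet and returns its seqno; the lock is held only briefly. */
    long write() {
        synchronized (lock) {
            long seqno = ++lastQueuedSeqno;
            dataQueue.add(seqno);
            lock.notifyAll(); // wake the streamer
            return seqno;
        }
    }

    /** Blocks until everything queued before this call has been acked. */
    void hflush() {
        long target;
        synchronized (lock) {
            target = lastQueuedSeqno; // record the seqno whose ack we need
            lock.notifyAll();         // tell the streamer to start flushing
        }                             // leave the synchronized section
        synchronized (lock) {
            try {
                while (lastAckedSeqno < target) {
                    lock.wait(); // wait() releases the monitor, so write() can proceed
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for ack", e);
            }
        }
    }

    /** Returns the highest seqno acked so far, or -1 if none. */
    long acked() {
        synchronized (lock) {
            return lastAckedSeqno;
        }
    }

    /** Simulated streamer: dequeues packets in order and acks each one. */
    private void streamAndAck() {
        try {
            while (true) {
                long seqno;
                synchronized (lock) {
                    while (dataQueue.isEmpty()) {
                        lock.wait();
                    }
                    seqno = dataQueue.remove();
                }
                Thread.sleep(1); // simulated pipeline round trip, outside the lock
                synchronized (lock) {
                    lastAckedSeqno = seqno;
                    lock.notifyAll(); // wake any hflush() waiting on this ack
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch the waiting side relies on Object.wait() releasing the monitor, so a writer thread can enter write() while another thread is blocked in hflush(). A caller would create the object, call start(), issue some write() calls, and then call hflush() to block until the last of those packets is acked.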
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.