hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs
Date Sat, 19 Aug 2017 15:54:01 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134151#comment-16134151

Steve Loughran commented on HADOOP-14520:

I've been looking at the current page blob code, and I've seen something else I think may
need fixing; how things close

* {{PageBlobOutputStream.close()}} call is completely {{synchronized}}, even during a long
upload. Which means that until that upload completes, all calls may block
* I don't see the flush or wrtie calls checking for the stream being open before doing anything,
meaning that the only way that the state will be checked is internally, in  {{iopool.execute()}}
or elsewhere.

I think here it'd be good to move to an atomic boolean to manage closed state, and use it
as the guard to the operations.

> WASB: Block compaction for Azure Block Blobs
> --------------------------------------------
>                 Key: HADOOP-14520
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14520
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/azure
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Georgi Chalakov
>            Assignee: Georgi Chalakov
>         Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync call. When
the number of blocks is above 32000, next hflush/hsync triggers the block compaction process.
Block compaction replaces a sequence of blocks with one block. From all the sequences with
total length less than 4M, compaction chooses the longest one. It is a greedy algorithm that
preserve all potential candidates for the next round. Block Compaction for WASB increases
data durability and allows using block blobs instead of page blobs. By default, block compaction
is disabled. Similar to the configuration for page blobs, the client needs to specify HDFS
folders where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-05.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 707, Failures: 0, Errors: 0, Skipped: 119

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message