hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15193) add bulk delete call to metastore API & DDB impl
Date Thu, 22 Feb 2018 10:36:02 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372638#comment-16372638 ]

Steve Loughran commented on HADOOP-15193:

I'm thinking of doing this with some explicit {{BulkOperationInfo extends Closeable}} class

BulkOperationInfo initiateDirectoryDelete(path)
// every path must be under the path specified in the bulk operation
void deleteBatch(BulkOperationInfo, List<Path>)
void completeBulkOperation(BulkOperationInfo, boolean wasSuccessful)

This lines us up for setting up other bulk ops, like an explicit rename.

Why this way? It allows us to tell the store that the batches are all part of the same rmdir
call, and that there is little/no need to create any parent dir markers, etc, etc, because
everything is expected to work. The complete call can do that and choose what to use as a
success/failure marker.

The base impl will do nothing but verify that, in a batch delete, all paths are valid, then issue
the deletes one by one; nothing is done in complete().
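
A minimal sketch of how the three calls and the base impl might fit together. Everything here beyond the three method names is an assumption made for illustration: the {{BulkDeleteSketch}} and {{BaseStore}} class names, the {{deletedPaths()}} accessor, the exception types, the in-memory list standing in for the real store, and the use of {{java.nio.file.Path}} in place of Hadoop's {{Path}} so the example runs on its own. {{BulkOperationInfo}} is written as a class implementing {{Closeable}} rather than as an interface, purely to keep the sketch short.

```java
import java.io.Closeable;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BulkDeleteSketch {

    /** Handle that ties a series of batches to one directory delete (hypothetical shape). */
    public static class BulkOperationInfo implements Closeable {
        final Path root;
        boolean open = true;

        BulkOperationInfo(Path root) {
            this.root = root;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Base impl: validate every path in the batch, then delete entries one by one. */
    public static class BaseStore {
        private final List<Path> deleted = new ArrayList<>();

        public BulkOperationInfo initiateDirectoryDelete(Path path) {
            return new BulkOperationInfo(path);
        }

        public void deleteBatch(BulkOperationInfo op, List<Path> paths) {
            if (!op.open) {
                throw new IllegalStateException("bulk operation already closed");
            }
            // Validate the whole batch first so a bad path rejects it before any delete.
            for (Path p : paths) {
                if (!p.startsWith(op.root)) {
                    throw new IllegalArgumentException(p + " is not under " + op.root);
                }
            }
            for (Path p : paths) {
                delete(p);
            }
        }

        /** Stand-in for the existing single-entry delete; here it only records the path. */
        protected void delete(Path p) {
            deleted.add(p);
        }

        public void completeBulkOperation(BulkOperationInfo op, boolean wasSuccessful) {
            // Nothing to do in the base impl; a real store could write its marker here.
        }

        public List<Path> deletedPaths() {
            return deleted;
        }
    }

    public static void main(String[] args) {
        BaseStore store = new BaseStore();
        Path dir = Paths.get("/data/logs");
        try (BulkOperationInfo op = store.initiateDirectoryDelete(dir)) {
            store.deleteBatch(op, Arrays.asList(dir.resolve("a"), dir.resolve("b")));
            store.deleteBatch(op, Arrays.asList(dir.resolve("sub").resolve("c")));
            store.completeBulkOperation(op, true);
        }
        System.out.println("deleted " + store.deletedPaths().size() + " entries");
    }
}
```

A store with a real batch call, such as the DDB one, could override {{deleteBatch}} to send the validated list in one request and use {{completeBulkOperation}} to decide on parent dir markers.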

> add bulk delete call to metastore API & DDB impl
> ------------------------------------------------
>                 Key: HADOOP-15193
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15193
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Priority: Major
> Recursive dir delete (and any future bulk delete API like HADOOP-15191) benefits from
> using the DDB bulk table delete call, which takes a list of deletes and executes them. Hopefully
> this will offer better perf.

