hadoop-common-issues mailing list archives

From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-14020) Optimize dirListingUnion
Date Tue, 24 Jan 2017 21:58:26 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-14020:
    Attachment: HADOOP-14020-HADOOP-13345.001.patch

Attaching a patch with the optimization and tests. I originally went with a separate property
to enable write-back, but the more I think about it, the more sense it makes to just use
authoritative mode for this. Let me know if you disagree. It eliminates a few lines
from the patch. I'm unable to build and test the patch because of issues with the DynamoDB Local
repo that I'm having trouble working around, so I'm posting it for initial comment only. I would
vote to go with the next patch (the one that just uses the authoritative mode config) unless
anyone can think of a use case that justifies two separate configs.

> Optimize dirListingUnion
> ------------------------
>                 Key: HADOOP-14020
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14020
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-14020-HADOOP-13345.001.patch
> There's a TODO in dirListingUnion:
> {quote}// TODO optimize for when allowAuthoritative = false{quote}
> There will be cases when we can intelligently avoid a round trip: if the S3A results are
a subset of the metadatastore results (including when they are equal or empty), then writing back
will do nothing (although perhaps that should set the authoritative flag if it isn't set already).
> There may also be cases where users want to just skip that altogether. It's wasted work
if authoritative mode is disabled, so perhaps we want to trigger a skip if that's false, or
perhaps it should be a separate property. First one makes for simpler config, second is more
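
The skip conditions described above can be sketched roughly as follows. This is only an illustration, not the code in the attached patch: the class DirListingUnionSketch, its merge method, and the MergeResult type are hypothetical names, listings are reduced to plain path strings, and the rule "write back only when authoritative mode is enabled and either the store gained entries or its authoritative flag is not yet set" is one assumed reading of the options discussed in this issue.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the write-back decision; not the S3Guard implementation. */
public class DirListingUnionSketch {

    /** Outcome of a merge: the combined listing and whether a write-back is worthwhile. */
    public static final class MergeResult {
        public final Set<String> union;
        public final boolean writeBackNeeded;

        MergeResult(Set<String> union, boolean writeBackNeeded) {
            this.union = union;
            this.writeBackNeeded = writeBackNeeded;
        }
    }

    /**
     * Merge the metadata store listing with the backing (S3A) listing.
     *
     * @param storeEntries         paths currently known to the metadata store
     * @param storeIsAuthoritative whether the store listing is already flagged authoritative
     * @param backingEntries       paths returned by the backing store
     * @param allowAuthoritative   assumed to mirror the authoritative mode setting
     */
    public static MergeResult merge(Collection<String> storeEntries,
                                    boolean storeIsAuthoritative,
                                    Collection<String> backingEntries,
                                    boolean allowAuthoritative) {
        Set<String> union = new TreeSet<>(storeEntries);

        // addAll returns true only if the backing listing contributed an entry
        // the store did not have, i.e. it was NOT a subset (equal or empty
        // backing listings leave the set unchanged).
        boolean storeGainedEntries = union.addAll(backingEntries);

        // Assumption: with authoritative mode disabled, the write-back is
        // skipped entirely. With it enabled, write back if there is something
        // new to record or the authoritative flag still has to be set.
        boolean writeBackNeeded =
            allowAuthoritative && (storeGainedEntries || !storeIsAuthoritative);

        return new MergeResult(union, writeBackNeeded);
    }
}
```

Under these assumptions, merging a store listing of {a, b} that is already flagged authoritative with a backing listing of [a] reports that no write-back is needed, while a backing listing containing an unknown path, or a store listing whose flag is unset, triggers one when authoritative mode is enabled.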

This message was sent by Atlassian JIRA

