hadoop-common-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.
Date Fri, 18 Nov 2016 02:44:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingliang Liu updated HADOOP-13449:
-----------------------------------
    Attachment: HADOOP-13449-HADOOP-13345.005.patch

Thanks for the discussion, [~fabbri]. That's very helpful.

{quote}
for v1, you could always return authoritative = false. 
{quote}
Yes, that's what the current patch does. Let's address this in a follow-up JIRA after both
[HADOOP-13651] and this issue are committed.

{quote}
The interface allows any of these behaviors.... The filesystem is responsible for ensuring
that the delete to /a must be recursive since it is not empty. MetadataStore explicitly does
not do that.
{quote}
Agreed. For example, {{delete(path)}} does not check whether the directory at that path is empty.
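
A minimal sketch of that division of responsibility, not the code in the patch: {{SketchMetadataStore}} and {{SketchFileSystem}} are hypothetical names, the store is an in-memory sorted map rather than DynamoDB, and an unchecked exception stands in for whatever the real filesystem would throw. Only non-root paths are handled.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a metadata store: it deletes what it is told to
// delete and never checks whether a directory is empty.
class SketchMetadataStore {
    // Path -> isDirectory, kept sorted so a subtree is a contiguous key range.
    final TreeMap<String, Boolean> entries = new TreeMap<>();

    void put(String path, boolean isDirectory) {
        entries.put(path, isDirectory);
    }

    // No emptiness check here: the caller owns that contract.
    void delete(String path) {
        entries.remove(path);
    }

    // Removes the path and everything below it.
    void deleteSubtree(String path) {
        entries.remove(path);
        // '0' is the character right after '/', so this range covers exactly
        // the keys that start with path + "/".
        entries.subMap(path + "/", path + "0").clear();
    }

    boolean hasChildren(String path) {
        return !entries.subMap(path + "/", path + "0").isEmpty();
    }
}

// Hypothetical filesystem layer: this is where the "non-empty directories
// need recursive = true" rule is enforced.
class SketchFileSystem {
    private final SketchMetadataStore store;

    SketchFileSystem(SketchMetadataStore store) {
        this.store = store;
    }

    boolean delete(String path, boolean recursive) {
        if (!store.entries.containsKey(path)) {
            return false; // nothing to delete
        }
        if (store.hasChildren(path)) {
            if (!recursive) {
                throw new IllegalStateException("Directory not empty: " + path);
            }
            store.deleteSubtree(path);
        } else {
            store.delete(path);
        }
        return true;
    }
}
```

With {{/a}} and {{/a/b}} in the store, {{delete("/a", false)}} is rejected by the filesystem layer, while {{delete("/a", true)}} removes both entries and leaves a sibling such as {{/ab}} untouched.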

{quote}
You either have to (A) pay money to store an extra copy of your metadata forever, or (B) spend
money and time hydrating the MetadataStore each time you start a cluster.
{quote}
The metadata is expected to be small, and DDB storage is cheap compared with its read/write
operation pricing. If I have to choose, (A) makes more sense.

{quote}
and we don't assume everything is always in DynamoDB, it makes recovery much easier
{quote}
That's a valid point. Updating S3 and the MetadataStore is not atomic.

{quote}
The other concern is that I just don't understand why you would want to do the preloading.
{quote}
Do you mean import? I suppose not. For reading and writing existing S3 buckets, importing the
structure first seems to be a prerequisite, unless we assume the store discovers/converges
quickly or we settle for little consistency.
I guess you mean the constraints on pre-creating parent directories. I re-read the design
doc and the [HADOOP-13651] patch, and I think you made a good point about this. Let S3AFileSystem
ensure the contract.

Moreover, I now think storing the is_empty bit in DynamoDB is not ideal. Maintaining it takes
non-trivial effort and is easy to get wrong. Perhaps we can query with the parent directory
as the HASH key when we need this information. That is not trivial either; I'll think about it
as my next piece of work. We can either fix this in the next patch, or I'll work on it in a follow-up JIRA.
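
A rough sketch of the query-based idea, under stated assumptions: the table is keyed by parent path (hash key) and child name (range key), and an in-memory map stands in for DynamoDB. {{ParentKeyedTable}} and its methods are hypothetical names, not part of the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical in-memory model of a table whose hash key is the parent
// directory path and whose range key is the child name.
class ParentKeyedTable {
    private final Map<String, TreeSet<String>> items = new HashMap<>();

    void put(String parent, String child) {
        items.computeIfAbsent(parent, k -> new TreeSet<>()).add(child);
    }

    void delete(String parent, String child) {
        TreeSet<String> children = items.get(parent);
        if (children != null) {
            children.remove(child);
            if (children.isEmpty()) {
                items.remove(parent); // no items left under this hash key
            }
        }
    }

    // Emptiness is derived at read time: "is there at least one item whose
    // hash key is this directory?" No is_empty flag is stored or maintained.
    boolean isEmptyDirectory(String dirPath) {
        TreeSet<String> children = items.get(dirPath);
        return children == null || children.isEmpty();
    }
}
```

The answer only reflects what the store knows. As long as listings are treated as non-authoritative, an empty result here does not prove that the directory is empty in S3.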

If this patch is still in question, a conference call would be very helpful. Let's schedule
one for next week; [~stevel@apache.org] is traveling this week.

[~eddyxu], do you have any more comments now that I have revised the patch?

Thank you,

> S3Guard: Implement DynamoDBMetadataStore.
> -----------------------------------------
>
>                 Key: HADOOP-13449
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13449
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-13449-HADOOP-13345.000.patch, HADOOP-13449-HADOOP-13345.001.patch,
HADOOP-13449-HADOOP-13345.002.patch, HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch,
HADOOP-13449-HADOOP-13345.005.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.




