hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15621) s3guard: implement time-based (TTL) expiry for DynamoDB Metadata Store
Date Sun, 02 Sep 2018 10:13:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16601163#comment-16601163
] 

Hadoop QA commented on HADOOP-15621:
------------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue}
Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} |
{color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color}
| {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 22s{color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 29s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 21s{color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 35s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m  8s{color}
| {color:green} branch has no errors when building and testing our client artifacts. {color}
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 44s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 23s{color} |
{color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 33s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 26s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 26s{color} | {color:green}
the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 15s{color}
| {color:orange} hadoop-tools/hadoop-aws: The patch generated 4 new + 4 unchanged - 0 fixed
= 8 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 30s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color}
| {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 32s{color}
| {color:green} patch has no errors when building and testing our client artifacts. {color}
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  3s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 26s{color} |
{color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 39s{color} | {color:green}
hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 45s{color}
| {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m  1s{color} | {color:black}
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HADOOP-15621 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938037/HADOOP-15621.001.patch
|
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit
 shadedclient  findbugs  checkstyle  |
| uname | Linux 9502f936dfe3 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / eed8415 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/15124/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
|
|  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/15124/testReport/ |
| Max. process+thread count | 330 (vs. ulimit of 10000) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15124/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> s3guard: implement time-based (TTL) expiry for DynamoDB Metadata Store
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-15621
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15621
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Aaron Fabbri
>            Assignee: Gabor Bota
>            Priority: Minor
>         Attachments: HADOOP-15621.001.patch
>
>
> Similar to HADOOP-13649, I think we should add a TTL (time to live) feature to the Dynamo
metadata store (MS) for S3Guard.
> Think of this as the "online algorithm" version of the CLI prune() function, which is
the "offline algorithm".
> Why: 
> 1. Self healing (soft state): since we do not implement transactions around modification
of the two systems (s3 and metadata store), certain failures can lead to inconsistency between
S3 and the metadata store (MS) state.  Having a time to live (TTL) on each entry in S3Guard
means that any inconsistencies will be time bound.  Thus "wait and restart your job" becomes
a valid, if ugly, way to get around any issues with FS client failure leaving things in a
bad state.
> 2. We could make manual invocation of `hadoop s3guard prune ...` unnecessary, depending
on the implementation.
> 3. Makes it possible to fix the problem that dynamo MS prune() doesn't prune directories
due to the lack of true modification time.
> How:
> I think we need a new column in the dynamo table "entry last written time".  This is
updated each time the entry is written to dynamo.
> After that we can either
> 1. Have the client simply ignore / elide any entries that are older than the configured
TTL.
> 2. Have the client delete entries older than the TTL.
> The issue with #2 is it will increase latency if done inline in the context of an FS
operation. We could mitigate this some by using an async helper thread, or probabilistically
doing it "some times" to amortize the expense of deleting stale entries (allowing some batching
as well).
> Caveats:
> - Clock synchronization as usual is a concern. Many clusters already keep clocks close
enough via NTP. We should at least document the requirement along with the configuration knob
that enables the feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message