accumulo-notifications mailing list archives

From "William Slacum (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4156) Tunable replication frequency
Date Wed, 02 Mar 2016 21:31:18 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176521#comment-15176521 ]

William Slacum commented on ACCUMULO-4156:
------------------------------------------

I was talking specifically about how Accumulo handles WAL replay in the face of failures,
unrelated to replication. We clarified offline; to summarize, it may be possible to piggyback
on the flush ID that is used to prevent WAL data from being replayed after it has been
flushed to disk via minor compaction.
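
To illustrate the flush ID idea, here is a minimal sketch of a replay cutoff. It assumes each WAL entry is tagged with the ID of the minor compaction that will persist it, and that the tablet records the highest completed flush ID. The `WalReplayFilter` and `Entry` names are hypothetical and are not Accumulo's actual recovery code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not Accumulo's actual WAL recovery logic. */
final class WalReplayFilter {

    /** Assumed WAL entry shape: a mutation tagged with the flush ID that will persist it. */
    static final class Entry {
        final long flushId;
        final String mutation;

        Entry(long flushId, String mutation) {
            this.flushId = flushId;
            this.mutation = mutation;
        }
    }

    /**
     * Returns only the entries that still need replay: those whose flush ID
     * is greater than the last flush ID known to have completed.
     */
    static List<Entry> entriesToReplay(List<Entry> wal, long lastFlushedId) {
        List<Entry> pending = new ArrayList<>();
        for (Entry e : wal) {
            if (e.flushId > lastFlushedId) {
                pending.add(e);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        List<Entry> wal = new ArrayList<>();
        wal.add(new Entry(1, "row1"));
        wal.add(new Entry(2, "row2"));
        wal.add(new Entry(3, "row3"));
        // With flush ID 2 completed, only the entry tagged 3 remains.
        for (Entry e : entriesToReplay(wal, 2)) {
            System.out.println(e.flushId + " " + e.mutation);
        }
    }
}
```

A replication mechanism that piggybacks on this could, under the same assumptions, use a similar cutoff to decide which WAL data has not yet been handled.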

> Tunable replication frequency
> -----------------------------
>
>                 Key: ACCUMULO-4156
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4156
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.7.1
>            Reporter: William Slacum
>             Fix For: 1.8.0
>
>
> Currently, replication happens when a write ahead log file is closed. The only parameter
> that controls when this event occurs is the write ahead log size, and it applies only to
> the tablet servers themselves.
> By default this means that when replication happens depends not only on the table it is
> configured on, but also on exogenous factors such as total write load and failures. If a
> system receives ~100MB/day/TServer, and the WAL size is its default 1GB, it will take about
> 10 days for any replication event to occur. Another possibility is that an unreplicated
> table receives many writes, which will cause more frequent replication events, but each
> event will carry proportionally less data for the table being replicated.
> I don't have a specific implementation in mind, but I'd like to see a solution that
> isolates the work to table-specific triggers such as time-since-last-replication and
> data-added-since-last-replication.
> [~elserj] has had some ideas about doing things incrementally within WAL files (i.e.,
> replicating between two sync points) that could also help with this.
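
The latency arithmetic and the two proposed triggers can be sketched as follows. This is a hypothetical illustration, not an Accumulo API: the `ReplicationTrigger` class, its thresholds, and the use of decimal units (1GB = 1000MB) are all assumptions made for the example.

```java
import java.time.Duration;

/** Hypothetical per-table replication trigger; not part of Accumulo. */
final class ReplicationTrigger {
    private final Duration maxAge; // time-since-last-replication threshold
    private final long maxBytes;   // data-added-since-last-replication threshold

    ReplicationTrigger(Duration maxAge, long maxBytes) {
        this.maxAge = maxAge;
        this.maxBytes = maxBytes;
    }

    /** Fires when either threshold for the table has been reached. */
    boolean shouldReplicate(Duration sinceLast, long bytesSinceLast) {
        return sinceLast.compareTo(maxAge) >= 0 || bytesSinceLast >= maxBytes;
    }

    /** Days until a WAL of the given size fills at a steady ingest rate. */
    static double daysUntilWalCloses(long walBytes, long bytesPerDay) {
        return (double) walBytes / bytesPerDay;
    }

    public static void main(String[] args) {
        long mb = 1_000_000L; // decimal megabyte, assumed for simplicity
        // The example from the issue: 1GB WAL at 100MB/day/TServer.
        System.out.println(daysUntilWalCloses(1000 * mb, 100 * mb)); // prints 10.0

        // Assumed thresholds: replicate at least hourly, or after 50MB of new data.
        ReplicationTrigger trigger = new ReplicationTrigger(Duration.ofHours(1), 50 * mb);
        System.out.println(trigger.shouldReplicate(Duration.ofHours(2), 0L));
        System.out.println(trigger.shouldReplicate(Duration.ofMinutes(5), 60 * mb));
    }
}
```

Evaluating the thresholds per table, rather than per tablet server, is what would decouple replication timing from unrelated write load in this sketch.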



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
