Date: Wed, 2 Mar 2016 20:12:18 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-4156) Tunable replication frequency

    [ https://issues.apache.org/jira/browse/ACCUMULO-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176380#comment-15176380 ]

Josh Elser commented on ACCUMULO-4156:
--------------------------------------

bq. I didn't see any special offsets for WAL files

Sorry, I meant to say, definitively, that this doesn't exist. I think this was something I considered as a future improvement which would enable more responsive replication.

bq. I think you'd need some marker for a WAL that lives through a flush so you don't do a double-insert in case of a failure after a flush.

But a flush is just pushing the IMM to disk -- the records should already be recorded in the WAL by the time they make it into the IMM. Am I misunderstanding?

> Tunable replication frequency
> -----------------------------
>
>                 Key: ACCUMULO-4156
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4156
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.7.1
>            Reporter: William Slacum
>             Fix For: 1.8.0
>
>
> Currently, replication happens when a write-ahead log (WAL) file is closed. The only parameter that controls when this event occurs is the write-ahead log size, and it applies only to the tablet servers themselves.
> By default this means that the timing of replication is not tied solely to the table it is configured on, but also to exogenous factors such as total write load and failures. If a system receives ~100MB/day/TServer, and the WAL size is its default 1GB, it will take roughly 10 days for any replication event to occur. Another possibility is that an unreplicated table is receiving many writes, which will cause more frequent replication events, but proportionally the work will involve less data for the table being replicated.
> I don't have a specific implementation in mind, but I'd like to see a solution that involves isolating the work down to specific table events such as time-since-last-replication and data-added-since-last-replication.
> [~elserj] has had some ideas about doing things incrementally within WAL files (i.e., replicating between two sync points) that can also help with this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
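
For reference, the 10-day figure in the issue description and the two proposed per-table events can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats 1GB as 1000MB, as the description's round figure does, and the names days_until_wal_closes and ReplicationTrigger and both threshold fields are hypothetical. None of them are existing Accumulo properties or APIs.

```python
from dataclasses import dataclass

# Decimal units, matching the round numbers used in the issue description.
MB = 10**6
GB = 10**9


def days_until_wal_closes(wal_size_bytes: int, ingest_bytes_per_day: int) -> float:
    """Days of ingest needed to fill one WAL, i.e. the wait before it closes."""
    if ingest_bytes_per_day <= 0:
        raise ValueError("ingest rate must be positive")
    return wal_size_bytes / ingest_bytes_per_day


@dataclass
class ReplicationTrigger:
    """Hypothetical per-table policy; neither threshold exists in Accumulo."""
    max_seconds_since_last: float
    max_bytes_since_last: int

    def should_replicate(self, seconds_since_last: float, bytes_since_last: int) -> bool:
        # With no new data for the table there is nothing to ship,
        # so elapsed time alone never fires the trigger.
        if bytes_since_last <= 0:
            return False
        # Fire on whichever event comes first: enough time or enough data.
        return (seconds_since_last >= self.max_seconds_since_last
                or bytes_since_last >= self.max_bytes_since_last)


if __name__ == "__main__":
    print(days_until_wal_closes(1 * GB, 100 * MB))
    hourly = ReplicationTrigger(max_seconds_since_last=3600,
                                max_bytes_since_last=10 * MB)
    print(hourly.should_replicate(seconds_since_last=4000, bytes_since_last=1 * MB))
```

Under these assumptions the wait depends only on WAL size and total ingest per tablet server, whereas a policy like the one sketched would be evaluated per table and would not depend on writes to unreplicated tables.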