Date: Sun, 19 Oct 2014 06:44:33 +0000 (UTC)
From: "Christopher Tubbs (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-3236) Clone table into an existing table

    [ https://issues.apache.org/jira/browse/ACCUMULO-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176231#comment-14176231 ]

Christopher Tubbs commented on ACCUMULO-3236:
---------------------------------------------

bq. Has this discussion moved from "how should we do this" to "what should we call it?"

I think so.

bq. I do think "clone" is a composition of the snapshot'ing functionality, table creation, import, ... I think "snapshot" implies some explicit view at a point in time of a table.

Because of the way clone is implemented, I would describe the relationship between "clone" and "snapshot" the other way around. Clone is not a composition of snapshot functionality. Rather, a snapshot can be achieved by using clone while also ensuring the table is offline (or at least that ingest is halted and the table is completely flushed). Clone by itself can produce very unexpected behavior if you assume it represents a snapshot in time of a table. For instance, a clone can pick up new data that was bulk imported to a tablet but miss older data that is still waiting to be flushed. So even within a single tablet you can get gaps in time from clone, never mind the expectations for a snapshot view of an entire table.

> Clone table into an existing table
> ----------------------------------
>
>                 Key: ACCUMULO-3236
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3236
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: John Vines
>             Fix For: 1.7.0
>
>
> Currently we have the ability to clone a table, which takes all files belonging to an existing table and then makes them owned by a second, brand new table. I think there is a logical extension to this where you can add the files to an already existing table.
> One point of concern is if data is unused in existing files due to major compactions of the shared files in the source table. This can be mitigated by either chopping the files (which sorta goes against the idea of cloning) or ensuring that the source table's splits exist in the destination table.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
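
The gap described in the comment (a clone seeing newer bulk-imported data while missing older unflushed data) can be illustrated with a toy model. This is only a sketch, not Accumulo code: the `Tablet` class and its `write`, `bulk_import`, `flush`, `clone`, and `entries` methods are hypothetical names invented for this example, and the model assumes only what the comment states, namely that a clone takes over a tablet's files and does not see data still held in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Tablet:
    """Toy tablet (hypothetical model, not Accumulo's implementation)."""
    files: list = field(default_factory=list)   # each file is a list of (timestamp, key)
    memory: list = field(default_factory=list)  # writes not yet flushed to a file

    def write(self, ts, key):
        # Normal ingest lands in memory first.
        self.memory.append((ts, key))

    def bulk_import(self, entries):
        # Bulk import adds a complete file directly, bypassing memory.
        self.files.append(list(entries))

    def flush(self):
        # Flushing turns the in-memory data into a new file.
        if self.memory:
            self.files.append(list(self.memory))
            self.memory.clear()

    def clone(self):
        # The clone references the existing files only; memory is not copied.
        return Tablet(files=list(self.files))

    def entries(self):
        # Everything readable from this tablet, ordered by timestamp.
        return sorted(e for f in self.files + [self.memory] for e in f)


src = Tablet()
src.write(1, "a")             # older data, still waiting to be flushed
src.bulk_import([(2, "b")])   # newer data, arrives as a file

naive = src.clone()           # clone without flushing: misses timestamp 1
src.flush()
snap = src.clone()            # clone after flushing: consistent view

print(naive.entries())        # [(2, 'b')]
print(snap.entries())         # [(1, 'a'), (2, 'b')]
```

In this model the naive clone contains the entry with timestamp 2 but not the earlier entry with timestamp 1, which is the "gap in time" the comment warns about. Flushing (or, per the comment, halting ingest and taking the table offline) before cloning closes the gap.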