Mailing-List: contact notifications-help@accumulo.apache.org; run by ezmlm
Precedence: bulk
Reply-To: jira@apache.org
Date: Sun, 19 Oct 2014 06:44:33 +0000 (UTC)
From: "Christopher Tubbs (JIRA)" <jira@apache.org>
To: notifications@accumulo.apache.org
Message-ID: <JIRA.12748384.1413400650000.292478.1413701073977@Atlassian.JIRA>
In-Reply-To: <JIRA.12748384.1413400650000@Atlassian.JIRA>
References: <JIRA.12748384.1413400650000@Atlassian.JIRA>
 <JIRA.12748384.1413400650044@arcas>
Subject: [jira] [Commented] (ACCUMULO-3236) Clone table into an existing
 table
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/ACCUMULO-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176231#comment-14176231 ] 

Christopher Tubbs commented on ACCUMULO-3236:
---------------------------------------------

bq. Has this discussion moved from "how should we do this" to "what should we call it?"

I think so.

bq. I do think "clone" is a composition of the snapshot'ing functionality, table creation, import, ...

I think "snapshot" implies an explicit point-in-time view of a table. Because of the way clone is implemented, I would describe the relationship between "clone" and "snapshot" the other way around: clone is not a composition of snapshot functionality. Rather, a snapshot can be achieved using clone, provided the table is offline (or at least ingest is halted and the table is completely flushed). Clone by itself can produce very unexpected results if you assume it represents a point-in-time snapshot of a table. For instance, a clone can pick up new data that was bulk imported to a tablet but miss older data that is still waiting to be flushed. So even within a single tablet, a clone can have gaps in time, never mind the expectations of a snapshot view of an entire table.
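
To make the gap concrete, here is a rough toy model in Python. It is not Accumulo code: `ToyTablet` and `clone` are hypothetical names, and the model assumes only what is described above, namely that a clone takes over a tablet's existing files and does not see data that is still buffered in memory.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTablet:
    """Hypothetical stand-in for a tablet: flushed files plus an in-memory buffer."""
    files: list = field(default_factory=list)   # each file is a tuple of entries
    memory: list = field(default_factory=list)  # entries written but not yet flushed

    def write(self, entry):
        # Normal ingest lands in memory first.
        self.memory.append(entry)

    def bulk_import(self, entries):
        # Bulk import adds a file directly, bypassing the in-memory buffer.
        self.files.append(tuple(entries))

    def flush(self):
        # Flushing turns the buffered entries into a file.
        if self.memory:
            self.files.append(tuple(self.memory))
            self.memory = []

    def visible(self):
        # Everything a reader of this tablet would see, sorted for comparison.
        in_files = [e for f in self.files for e in f]
        return sorted(in_files + self.memory)


def clone(tablet, flush_first=False):
    # Assumption of this sketch: a clone shares the source's files only.
    if flush_first:
        tablet.flush()
    return ToyTablet(files=list(tablet.files))


src = ToyTablet()
src.write("t1-live")          # older data, still waiting to be flushed
src.bulk_import(["t2-bulk"])  # newer data, already a file

gap = clone(src)                      # misses t1-live, sees t2-bulk
full = clone(src, flush_first=True)   # sees both entries

print(gap.visible())   # ['t2-bulk']
print(full.visible())  # ['t1-live', 't2-bulk']
```

In this model the unflushed clone contains the newer bulk-imported entry but not the older buffered one, which is the kind of gap I mean. Flushing first (with ingest halted) is what makes the clone behave like a snapshot.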

> Clone table into an existing table
> ----------------------------------
>
>                 Key: ACCUMULO-3236
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3236
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: John Vines
>             Fix For: 1.7.0
>
>
> Currently we have the ability to clone a table, which takes all files belonging to an existing table and then makes them owned by a second, brand new table. I think there is a logical extension to this where you can add the files to an already existing table.
> One point of concern is if data is unused in existing files due to major compactions of the shared files in the source table. This can be mitigated by either chopping the files (which sorta goes against the idea of cloning) or ensuring that the source table's splits exist in the destination table.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)