accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3236) Clone table into an existing table
Date Sun, 19 Oct 2014 06:44:33 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176231#comment-14176231
] 

Christopher Tubbs commented on ACCUMULO-3236:
---------------------------------------------

bq. Has this discussion moved from "how should we do this" to "what should we call it?"

I think so.

bq. I do think "clone" is a composition of the snapshot'ing functionality, table creation,
import, ...

I think "snapshot" implies an explicit view of a table at a point in time. Because of the
way clone is implemented, I would probably describe the relationship between "clone" and "snapshot"
somewhat the other way around. Clone is not so much a composition of snapshot functionality.
Rather, snapshotting can be achieved using clone, along with ensuring the table is offline
(or at least that ingest is halted and the table is completely flushed). Clone by itself can
result in some very unexpected behavior if you assume it represents a snapshot in time of
a table. For instance, a clone can pick up new data that was bulk imported to a tablet, but
not older data that is waiting to be flushed... so even within a tablet, you can get gaps
in time from clone, never mind the expectations for a snapshot view of an entire table.
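
To make the gap concrete, here is a toy model of a single tablet. It is only a sketch
under simplifying assumptions, not Accumulo code: the Tablet class and its write,
bulk_import, flush, clone and scan methods are hypothetical names invented for
illustration. The one assumption that matters is that a clone copies file references only
and cannot see data still held in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Tablet:
    """Toy tablet (hypothetical): a list of files plus an in-memory buffer."""
    files: list = field(default_factory=list)   # each file is a list of (timestamp, row)
    memory: list = field(default_factory=list)  # written but not yet flushed

    def write(self, ts, row):
        # Normal ingest lands in memory first.
        self.memory.append((ts, row))

    def bulk_import(self, entries):
        # Bulk import adds a file directly, bypassing the in-memory buffer.
        self.files.append(list(entries))

    def flush(self):
        # Flushing turns the in-memory buffer into a file.
        if self.memory:
            self.files.append(self.memory)
            self.memory = []

    def clone(self):
        # Assumption: the clone shares file references only; memory is invisible to it.
        return Tablet(files=list(self.files))

    def scan(self):
        # A scan sees everything the tablet holds, ordered by timestamp.
        return sorted([e for f in self.files for e in f] + self.memory)


src = Tablet()
src.write(1, "old-row")             # older data, still waiting to be flushed
src.bulk_import([(2, "new-row")])   # newer data, immediately present as a file

unflushed_clone = src.clone()       # clone taken without flushing first
src.flush()
flushed_clone = src.clone()         # clone taken after a complete flush

print(unflushed_clone.scan())       # has the newer row but not the older one
print(flushed_clone.scan())         # has both rows
```

In this model the first clone contains the row with timestamp 2 but not the row with
timestamp 1, which is the "gap in time" described above. Flushing completely (with ingest
halted) before cloning closes that gap.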

> Clone table into an existing table
> ----------------------------------
>
>                 Key: ACCUMULO-3236
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3236
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: John Vines
>             Fix For: 1.7.0
>
>
> Currently we have the ability to clone a table, which takes all files belonging to an
existing table and then makes them owned by a second, brand new table. I think there is a
logical extension to this where you can add the files to an already existing table.
> One point of concern is if data is unused in existing files due to major compactions
of the shared files in the source table. This can be mitigated by either chopping the files
(which sorta goes against the idea of cloning) or ensuring that the source table's splits exist
in the destination table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
