accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3236) Clone table into an existing table
Date Thu, 16 Oct 2014 01:19:33 GMT


Christopher Tubbs commented on ACCUMULO-3236:

bq. I'm avoiding the word copy. ...
I can understand the avoidance of the word "copy", but I don't think "clone" has the right
semantics either. Clone means to produce an identical copy. It does not mean to produce something
that contains an identical copy as a subset.

Perhaps the word "inject" might be better? Or, "insertClone", or "importFrom(table)"?

bq.  I'm avoiding the nuances of special casing of subsets of a table. ...
Sure, but the cloned table will be a subset of the target table... Also, I'm not sure this
is a simpler pass at this than ACCUMULO-571. This implies a destructive merging of two tables
because the target table is irrevocably changed. I think ACCUMULO-571 could be a much simpler
solution, because it could be implemented with very clear semantics (union is pretty well-defined
in this domain), doesn't require dealing with sub-ranges, and leaves both the original tables
in place, while still being as performant as a basic clone.
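To make the contrast concrete, here is a minimal sketch, not Accumulo code. It assumes a table can be modelled as a set of file references, and the names `union_and_store` and `clone_into` are hypothetical stand-ins for the ACCUMULO-571 style operation and the operation proposed here.

```python
# Toy model (an assumption for illustration): a table is a set of file references.

def union_and_store(tables, sources, new_name):
    """Hypothetical union-and-store: build a NEW table from the sources' files."""
    if new_name in tables:
        raise ValueError(f"table {new_name!r} already exists")
    merged = set()
    for name in sources:
        merged |= tables[name]
    tables[new_name] = merged  # the source tables are left untouched


def clone_into(tables, source, target):
    """Hypothetical clone-into-existing: add the source's files to the target."""
    tables[target] |= tables[source]  # mutates the target in place


tables = {"A": {"a1.rf", "a2.rf"}, "B": {"b1.rf"}}

union_and_store(tables, ["A", "B"], "C")
print(sorted(tables["A"]))  # A is unchanged by the union
print(sorted(tables["C"]))  # C holds the files of both A and B

clone_into(tables, "B", "A")
print(sorted(tables["A"]))  # A has now been changed
```

In this model the union leaves both inputs as they were, while the clone-into step leaves no record of what the target looked like before.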

In the end, I don't care which issue this is achieved under, but I think there are some opportunities
to expose the requirements here (and in ACCUMULO-571) with a very basic primitive union-and-store
operation, implemented as a multi-table clone, which can be easily extended. From
what I can tell, that's what you're already describing, except that your target table is also
one of your source tables. Because we already have a regular clone, I guess this is okay (because
users can clone source A1 as A2 before injecting source B into A2). However, the API semantics
for this seem much more muddled than a straight-up union-and-store operation.
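The A1/A2 workaround can be sketched the same way. This is an illustrative model only: tables are plain sets of file references, and `clone` and `inject` are hypothetical helpers, not the Accumulo client API.

```python
# Toy model (an assumption for illustration): a table is a set of file references.

def clone(tables, source, new_name):
    """Model of a regular clone: the new table references the same files."""
    if new_name in tables:
        raise ValueError(f"table {new_name!r} already exists")
    tables[new_name] = set(tables[source])


def inject(tables, source, target):
    """Hypothetical inject: add the source's file references to an existing table."""
    tables[target] |= tables[source]


tables = {"A1": {"a.rf", "b.rf"}, "B": {"c.rf"}}
expected_union = tables["A1"] | tables["B"]

# Clone first, so that the inject only ever mutates the copy.
clone(tables, "A1", "A2")
inject(tables, "B", "A2")

print(tables["A2"] == expected_union)  # A2 matches the union of A1 and B
print(sorted(tables["A1"]))            # A1 is preserved
```

Under these assumptions the two-step sequence produces the same result as a union of A1 and B, but the caller has to remember to clone first to keep A1 intact.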

> Clone table into an existing table
> ----------------------------------
>                 Key: ACCUMULO-3236
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: John Vines
>             Fix For: 1.7.0
> Currently we have the ability to clone a table, which takes all files belonging to an
> existing table and then makes them owned by a second, brand new table. I think there is a
> logical extension to this where you can add the files to an already existing table.
> One point of concern is if data is unused in existing files due to major compactions
> of the shared files in the source table. This can be mitigated by either chopping the files
> (which sorta goes against the idea of cloning) or ensuring that all source table splits exist
> in the destination table.

This message was sent by Atlassian JIRA