accumulo-dev mailing list archives

From "John Vines (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-571) MergeClone/BulkImport from existing table
Date Thu, 03 May 2012 15:02:49 GMT
John Vines created ACCUMULO-571:
-----------------------------------

             Summary: MergeClone/BulkImport from existing table
                 Key: ACCUMULO-571
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-571
             Project: Accumulo
          Issue Type: New Feature
          Components: client, tserver
            Reporter: John Vines
            Assignee: Billie Rinaldi


This is an idea that was recently brought to my attention. The use case is a user who wants to
clone a subset of a table into an existing table. Cloning does not currently allow this. The
current option is to copy the files in HDFS and then bulk import the copies, since bulk import
moves the files it is given. This is pretty wasteful. Under the hood, the system can already
handle cross-linking between files like that. We just need a mechanism for assigning a subset
of data to another region.
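
To make the storage cost concrete, here is a toy model of the two approaches. It is only a
sketch under stated assumptions: ToyFileSystem, ToyTable, copy_then_bulk_import and link_files
are hypothetical names invented for illustration, not Accumulo or HDFS APIs, and a table is
reduced to a set of file paths.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table: only the set of files it references."""
    name: str
    files: set = field(default_factory=set)


class ToyFileSystem:
    """Hypothetical stand-in for HDFS that tracks bytes stored per path."""

    def __init__(self):
        self.sizes = {}  # path -> size in bytes

    def copy(self, src, dst):
        # A physical copy stores the same bytes a second time.
        self.sizes[dst] = self.sizes[src]

    def move(self, src, dst):
        # A move changes the path but not the total bytes stored.
        self.sizes[dst] = self.sizes.pop(src)

    def used(self):
        return sum(self.sizes.values())


def copy_then_bulk_import(fs, target, paths):
    """Current workaround: copy each file, then let the import move the copy."""
    for path in sorted(paths):
        basename = path.rsplit("/", 1)[-1]
        staged = "/staging/" + basename
        fs.copy(path, staged)  # extra storage is paid here
        imported = "/tables/" + target.name + "/" + basename
        fs.move(staged, imported)  # the import consumes the staged copy
        target.files.add(imported)


def link_files(source, target, paths):
    """Proposed idea: the target references the source's files; nothing is copied."""
    missing = set(paths) - source.files
    if missing:
        raise ValueError("not part of the source table: %s" % sorted(missing))
    target.files |= set(paths)
```

In this model the copy path grows fs.used() by the size of every selected file, while
link_files leaves it unchanged and only adds references to the target.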

Potential uses include the one mentioned above, as well as letting users bring fresh
data into a table that was cloned and then modified. There may be other cases, but I haven't
fully thought out this problem space.

The biggest problem with this is that it puts the onus on the user to ensure that data in
the in-memory maps is flushed before moving, as well as to handle the possibility of duplicate
data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
