accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1044) bulk imported files showing up in metadata after bulk import fails
Date Tue, 09 Apr 2013 18:42:15 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626929#comment-13626929 ]

Eric Newton commented on ACCUMULO-1044:
---------------------------------------

MetadataBulkLoadFilter is running too quickly.

Bulk loading is a pretty intricate dance of checks:

 * The master distributes the bulk files to tablet servers
 * Tablet servers attempt to determine which tablets each file belongs to, based on the file's index information
 * When a tablet incorporates the new file, it has to update the !METADATA table
 * The tablet marks the fact that it has loaded the file
 * When the master completes the bulk import, it removes a transaction id in zookeeper
 * Tablets will not load files once the transaction id is removed
 * The master then asks if anyone is still working on that transaction id
 * Once the master has verified that nobody is doing anything on behalf of the transaction,
it removes the flags that indicate that the file loaded
 * Because splits can occur while the master is removing markers, there's a METADATA filter
to remove them
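
The filter rule in the last bullet can be sketched roughly as follows. This is only an illustration of the idea, not the actual MetadataBulkLoadFilter code: the dictionary layout, the "kind"/"txid" field names, and the function name are all made up for the example.

```python
def filter_load_flags(entries, live_txids):
    """Drop 'loaded' flags whose transaction id is no longer live.

    entries    -- list of dicts, a hypothetical stand-in for !METADATA entries
    live_txids -- set of transaction ids still present (zookeeper, in the comment)
    """
    kept = []
    for entry in entries:
        is_flag = entry["kind"] == "loaded"
        if is_flag and entry["txid"] not in live_txids:
            # Flag belongs to a transaction that is gone: treat it as stale.
            continue
        kept.append(entry)
    return kept


entries = [
    {"kind": "file", "name": "f1.rf"},
    {"kind": "loaded", "name": "f1.rf", "txid": 7},
    {"kind": "loaded", "name": "f2.rf", "txid": 8},
]
remaining = filter_load_flags(entries, live_txids={8})
```

Note that the rule only looks at whether the transaction id still exists; it has no way to tell whether the master has finished its own cleanup for that transaction.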

Here's the problem:
 * the master removes the transaction id
 * the metadata table major compacts, sees the id is missing, so it removes the flags: *this
is bad*
 * the master continues to wait for threads to stop doing work for the transaction
 * the master then sees that there are no references to the file and moves it to the failed
directory
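
To make the ordering problem concrete, here is a toy in-memory model of the steps above. It is a sketch under assumptions, not Accumulo code: the class, its method names, and the idea that the master decides "failed" purely from missing load flags are my reading of the description, and the sets and dicts merely stand in for zookeeper and the !METADATA table.

```python
class ToyBulkImport:
    """Hypothetical stand-in for the state touched during one bulk import."""

    def __init__(self, txid, files):
        self.txid = txid
        self.files = list(files)
        self.live_txids = {txid}   # stands in for transaction ids in zookeeper
        self.file_entries = set()  # stands in for file references in !METADATA
        self.load_flags = {}       # file name -> txid, the "loaded" markers
        self.failed = []           # files moved to the failed directory

    def tablet_loads(self, name):
        # Tablets only load files while the transaction id still exists.
        if self.txid in self.live_txids:
            self.file_entries.add(name)
            self.load_flags[name] = self.txid

    def master_removes_txid(self):
        self.live_txids.discard(self.txid)

    def metadata_compaction(self):
        # The filter drops flags whose transaction id is missing.
        self.load_flags = {f: t for f, t in self.load_flags.items()
                           if t in self.live_txids}

    def master_finishes(self):
        # Assumed rule: a file with no load flag is treated as not loaded.
        for name in self.files:
            if name not in self.load_flags:
                self.failed.append(name)
        # The master then removes the flags for its own transaction.
        self.load_flags = {f: t for f, t in self.load_flags.items()
                           if t != self.txid}


def run(order):
    state = ToyBulkImport(txid=7, files=["f1.rf"])
    state.tablet_loads("f1.rf")
    state.master_removes_txid()
    for step in order:
        getattr(state, step)()
    return state


# Compaction runs between "remove txid" and the master's final check.
racy = run(["metadata_compaction", "master_finishes"])
# Compaction runs only after the master has finished.
safe = run(["master_finishes", "metadata_compaction"])
```

In the racy order the toy ends with "f1.rf" in the failed list while its file entry is still present, which matches the symptom in the issue summary. In the other order nothing is marked failed and the flags are still cleaned up.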
                
> bulk imported files showing up in metadata after bulk import fails
> ------------------------------------------------------------------
>
>                 Key: ACCUMULO-1044
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1044
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master, tserver
>    Affects Versions: 1.4.2
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Critical
>             Fix For: 1.6.0
>
>
> Bulk import fails.  The file is moved to the failures directory.
> But references in the !METADATA table remain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
