tephra-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEPHRA-35) Prune invalid transaction set once all data for a given invalid transaction has been dropped
Date Wed, 02 Nov 2016 23:34:58 GMT

    [ https://issues.apache.org/jira/browse/TEPHRA-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630903#comment-15630903 ]

ASF GitHub Bot commented on TEPHRA-35:

GitHub user poornachandra opened a pull request:


    Save compaction state for pruning invalid list

    JIRA - https://issues.apache.org/jira/browse/TEPHRA-35
    Adds ability to save prune upper bound from the transaction snapshot used for compaction.
    Note that the first two commits are re-factoring existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/poornachandra/incubator-tephra feature/transaction-pruning

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19

commit 3075cb3cf1b2b52c8946f18e9adec21e8a90d589
Author: poorna <poorna@cask.co>
Date:   2016-10-28T22:12:23Z

    Save compaction state for pruning invalid list

commit be048335024fe03ec567090f0dc2c121d9bff08a
Author: poorna <poorna@cask.co>
Date:   2016-10-29T00:46:01Z

    Refactor existing test

commit 40ab5259722e8e138524c81b90bac2a16d455d24
Author: poorna <poorna@cask.co>
Date:   2016-11-01T03:59:23Z

    Refactor createTable to not add transaction co-processor by default


> Prune invalid transaction set once all data for a given invalid transaction has been dropped
> --------------------------------------------------------------------------------------------
>                 Key: TEPHRA-35
>                 URL: https://issues.apache.org/jira/browse/TEPHRA-35
>             Project: Tephra
>          Issue Type: New Feature
>            Reporter: Gary Helmling
>            Assignee: Poorna Chandra
>            Priority: Blocker
>         Attachments: ApacheTephraAutomaticInvalidListPruning-v2.pdf
>
> In addition to dropping the data from invalid transactions we need to be able to prune
> the invalid set of any transactions where data cleanup has been completely performed. Without
> this, the invalid set will grow indefinitely and become a greater and greater cost to in-progress
> transactions over time.
>
> To do this correctly, the TransactionDataJanitor coprocessor will need to maintain some
> bookkeeping for the transaction data that it removes, so that the transaction manager can
> reason about when all of a given transaction's data has been removed. Only at this point can
> the transaction manager safely drop the transaction ID from the invalid set.
>
> One approach would be for the TransactionDataJanitor to update a table marking when a
> major compaction was performed on a region and what transaction IDs were filtered out. Once
> all regions in a table containing the transaction data have been compacted, we can remove
> the filtered out transaction IDs from the invalid set. However, this will need to cope with
> changing region names due to splits, etc.
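
The bookkeeping described in the issue, combined with the "prune upper bound" mentioned in the pull request, could be sketched roughly as follows. This is an illustration only, not Tephra's implementation: the class and method names are hypothetical, the state is held in memory rather than in a table, and it assumes that a major compaction of a region removes the data of every invalid transaction up to the bound recorded for that region. Region splits and any other safety conditions are deliberately ignored.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; none of these names are taken from Tephra. */
public class InvalidListPruningSketch {
  // Region name -> highest bound recorded by a major compaction of that region.
  private final Map<String, Long> pruneUpperBoundByRegion = new HashMap<>();

  /** Record the bound taken from the snapshot that a major compaction used. */
  public void recordCompaction(String region, long pruneUpperBound) {
    // Keep the largest bound seen for the region.
    pruneUpperBoundByRegion.merge(region, pruneUpperBound, Math::max);
  }

  /**
   * Returns the smallest recorded bound across the given regions, or empty
   * if there are no regions or some region has not recorded a compaction yet.
   */
  public OptionalLong safePruneUpperBound(Collection<String> liveRegions) {
    if (liveRegions.isEmpty()) {
      return OptionalLong.empty();
    }
    long min = Long.MAX_VALUE;
    for (String region : liveRegions) {
      Long bound = pruneUpperBoundByRegion.get(region);
      if (bound == null) {
        return OptionalLong.empty(); // this region may still hold invalid data
      }
      min = Math.min(min, bound);
    }
    return OptionalLong.of(min);
  }

  /** Returns the invalid IDs that must be kept, i.e. those above the bound. */
  public static Set<Long> prune(Set<Long> invalid, long bound) {
    Set<Long> kept = new TreeSet<>();
    for (long id : invalid) {
      if (id > bound) {
        kept.add(id);
      }
    }
    return kept;
  }
}
```

Taking the minimum across regions reflects the requirement in the issue that every region holding the transaction's data must have been compacted before the ID is dropped; a single region without a record blocks pruning entirely.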

This message was sent by Atlassian JIRA
