tephra-dev mailing list archives

From anew <...@git.apache.org>
Subject [GitHub] incubator-tephra pull request #20: Compute global prune upper bound using co...
Date Sat, 03 Dec 2016 15:17:37 GMT
Github user anew commented on a diff in the pull request:

    https://github.com/apache/incubator-tephra/pull/20#discussion_r90759085
  
    --- Diff: tephra-core/src/main/java/org/apache/tephra/janitor/TransactionPruningPlugin.java ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package org.apache.tephra.janitor;
    +
    +import org.apache.hadoop.conf.Configuration;
    +
    +import java.io.IOException;
    +
    +/**
    + * Data janitor interface to manage the invalid transaction list.
    + *
    + * <p/>
    + * An invalid transaction can only be removed from the invalid list after the data written
    + * by the invalid transactions has been removed from all the data stores.
    + * The term data store is used here to represent a set of tables in a database that have
    + * the same data clean up policy, like all Apache Phoenix tables in an HBase instance.
    + *
    + * <p/>
    + * Typically every data store will have a background job which cleans up the data written by invalid transactions.
    + * Prune upper bound for a data store is defined as the largest invalid transaction whose data has been
    + * cleaned up from that data store.
    + * <pre>
    + * prune-upper-bound = min(max(invalid list), min(in-progress list) - 1)
    + * </pre>
    + * where invalid list and in-progress list are from the transaction snapshot used to clean up the invalid data in the
    + * data store.
    + *
    + * <p/>
    + * There will be one such plugin per data store. The plugins will be executed as part of the Transaction Service.
    + * Each plugin will be invoked periodically to fetch the prune upper bound for its data store.
    + * Invalid transaction list can pruned up to the minimum of prune upper bounds returned by all the plugins.
    + */
    +public interface TransactionPruningPlugin {
    +  /**
    +   * Called once when the Transaction Service starts up.
    +   *
    +   * @param conf configuration for the plugin
    +   */
    +  void initialize(Configuration conf) throws IOException;
    +
    +  /**
    +   * Called periodically to fetch prune upper bound for a data store. The plugin examines the state of data cleanup
    +   * in the data store and determines the smallest invalid transaction whose writes no longer exist in the data
    --- End diff --
    
    Or should this be the greatest lower bound of the transaction ids that may not be pruned?
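
For reference, the arithmetic described in the class Javadoc can be sketched as below. This is only an illustration of the quoted formula and of the "minimum over all plugins" rule, not code from the pull request: the class name `PruneUpperBoundSketch`, both method names, and the choice to return -1 when the invalid list is empty are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names exist in Tephra.
public class PruneUpperBoundSketch {

  // prune-upper-bound = min(max(invalid list), min(in-progress list) - 1)
  static long pruneUpperBound(Collection<Long> invalid, Collection<Long> inProgress) {
    if (invalid.isEmpty()) {
      return -1L; // assumption: no invalid transactions means nothing to prune
    }
    long maxInvalid = Collections.max(invalid);
    if (inProgress.isEmpty()) {
      return maxInvalid; // no in-progress transaction constrains the bound
    }
    return Math.min(maxInvalid, Collections.min(inProgress) - 1);
  }

  // The invalid list can be pruned up to the minimum of the per-store bounds.
  // Throws NoSuchElementException if no bounds are supplied.
  static long globalPruneUpperBound(Collection<Long> perStoreBounds) {
    return Collections.min(perStoreBounds);
  }

  public static void main(String[] args) {
    List<Long> invalid = Arrays.asList(10L, 20L, 30L);

    // Store A still had transactions 25 and 40 in progress in its snapshot.
    long storeA = pruneUpperBound(invalid, Arrays.asList(25L, 40L));
    // Store B had no in-progress transactions in its snapshot.
    long storeB = pruneUpperBound(invalid, Collections.<Long>emptyList());

    System.out.println("store A bound: " + storeA);
    System.out.println("store B bound: " + storeB);
    System.out.println("global bound:  "
        + globalPruneUpperBound(Arrays.asList(storeA, storeB)));
  }
}
```

With these made-up inputs, store A is limited by its oldest in-progress transaction (25 - 1 = 24), store B by the largest invalid transaction (30), and the global bound is the smaller of the two.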


