accumulo-dev mailing list archives

From keith-turner <...@git.apache.org>
Subject [GitHub] accumulo pull request #135: ACCUMULO-1787: created TwoTierCompactionStrategy...
Date Tue, 06 Sep 2016 18:02:56 GMT
Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/135#discussion_r77687200
  
    --- Diff: docs/src/main/resources/examples/README.compactionStrategy ---
    @@ -0,0 +1,60 @@
    +Title: Apache Accumulo Customizing the Compaction Strategy 
    +Notice:    Licensed to the Apache Software Foundation (ASF) under one
    +           or more contributor license agreements.  See the NOTICE file
    +           distributed with this work for additional information
    +           regarding copyright ownership.  The ASF licenses this file
    +           to you under the Apache License, Version 2.0 (the
    +           "License"); you may not use this file except in compliance
    +           with the License.  You may obtain a copy of the License at
    +           .
    +             http://www.apache.org/licenses/LICENSE-2.0
    +           .
    +           Unless required by applicable law or agreed to in writing,
    +           software distributed under the License is distributed on an
    +           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +           KIND, either express or implied.  See the License for the
    +           specific language governing permissions and limitations
    +           under the License.
    +
    +This tutorial uses the following Java classes, which can be found in org.apache.accumulo.tserver.compaction:
    +
    + * DefaultCompactionStrategy.java - determines which files to compact based on table.compaction.major.ratio and table.file.max
    + * EverythingCompactionStrategy.java - compacts all files
    + * SizeLimitCompactionStrategy.java - compacts files no bigger than table.majc.compaction.strategy.opts.sizeLimit
    + * TwoTierCompactionStrategy.java - uses default compression for smaller files and table.majc.compaction.strategy.opts.file.large.compress.type for larger files
    +
    +This is an example of how to configure a compaction strategy. By default Accumulo will always use the DefaultCompactionStrategy, unless 
    +these steps are taken to change the configuration.  Use the strategy and settings that best fits your Accumulo setup.
    +
    +The command below sets the compression for smaller files and minor compactions.
    +
    +    $ ./bin/accumulo shell -u root -p secret -e "config -s table.file.compress.type=snappy"
    +
    +The commands below will configure the TwoTierCompactionStrategy to use gz compression for files larger than 1M. 
    +
    +    $ ./bin/accumulo shell -u root -p secret -e "config -s table.majc.compaction.strategy.opts.file.large.compress.threshold=1M"
    +    $ ./bin/accumulo shell -u root -p secret -e "config -s table.majc.compaction.strategy.opts.file.large.compress.type=gz"
    +    $ ./bin/accumulo shell -u root -p secret -e "config -s table.majc.compaction.strategy=org.apache.accumulo.tserver.compaction.TwoTierCompactionStrategy"
    +
    +Generate some data and files in order to test the strategy:
    +
    +    $ ./bin/accumulo shell -u root -p secret -e "createtable test1"
    +    $ ./bin/accumulo org.apache.accumulo.examples.simple.client.SequentialBatchWriter -i instance17 -z localhost:2181 -u root -p secret -t test1 --start 0 --num 10000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20
    +    $ ./bin/accumulo shell -u root -p secret -e "flush -t test1"
    +    $ ./bin/accumulo org.apache.accumulo.examples.simple.client.SequentialBatchWriter -i instance17 -z localhost:2181 -u root -p secret -t test1 --start 0 --num 11000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20
    +    $ ./bin/accumulo shell -u root -p secret -e "flush -t test1"
    +    $ ./bin/accumulo org.apache.accumulo.examples.simple.client.SequentialBatchWriter -i instance17 -z localhost:2181 -u root -p secret -t test1 --start 0 --num 12000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20
    +    $ ./bin/accumulo shell -u root -p secret -e "flush -t test1"
    +    $ ./bin/accumulo org.apache.accumulo.examples.simple.client.SequentialBatchWriter -i instance17 -z localhost:2181 -u root -p secret -t test1 --start 0 --num 13000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20
    +    $ ./bin/accumulo shell -u root -p secret -e "flush -t test1"
    +
    +View the tserver log in <accumulo_home>/logs for the compaction and find the name of the <rfile> that was compacted for your table. Print info about this file using the PrintInfo tool:
    +
    +    $ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo <rfile>
    --- End diff --
    
    Can use `accumulo rfile-info` now that the command is fixed.
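
    For illustration only, a sketch of how the last README step might read with that suggestion applied. It assumes `rfile-info` accepts the same `<rfile>` argument as the `PrintInfo` class invocation in the diff and that the `./bin/accumulo` launcher is used as in the other steps; neither detail is confirmed in this thread.

    ```shell
    # Assumed equivalent of the PrintInfo invocation above, using the
    # rfile-info keyword; <rfile> is the placeholder from the README,
    # i.e. the compacted file named in the tserver log.
    $ ./bin/accumulo rfile-info <rfile>
    ```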

