From: git-site-role@apache.org
To: commits@accumulo.apache.org
Date: Thu, 08 Jul 2021 22:23:31 +0000
Subject: [accumulo-website] branch asf-staging updated: Automatic Site Publish by Buildbot

This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git

The following commit(s) were added to refs/heads/asf-staging by this push:
     new 2280e92  Automatic Site Publish by Buildbot

commit 2280e9222773b70642763ddc6acd2963c599d418
Author: buildbot
AuthorDate: Thu Jul 8 22:23:24 2021 +0000

    Automatic Site Publish by Buildbot
---
 output/blog/2021/07/08/external-compactions.html      | 651 +++++++++++++++++++++
 output/feed.xml                                        | 632 +++++++++++++++-----
 .../202107_ecomp/accumulo-compactor-muchos.yaml        | 180 ++++++
 output/images/blog/202107_ecomp/ci-entries.png         | Bin 0 -> 31996 bytes
 .../blog/202107_ecomp/ci-files-per-tablet.png          | Bin 0 -> 33333 bytes
 output/images/blog/202107_ecomp/ci-ingest-rate.png     | Bin 0 -> 43035 bytes
 .../images/blog/202107_ecomp/ci-online-tablets.png     | Bin 0 -> 29720 bytes
 .../images/blog/202107_ecomp/ci-pods-running.png       | Bin 0 -> 35805 bytes
 output/images/blog/202107_ecomp/ci-queued.png          | Bin 0 -> 45185 bytes
 output/images/blog/202107_ecomp/ci-running.png         | Bin 0 -> 58471 bytes
 .../images/blog/202107_ecomp/clusters-layout.png       | Bin 0 -> 107997 bytes
 .../images/blog/202107_ecomp/files_over_time.html      |  23 +
 .../202107_ecomp/full-table-compaction-queued.png      | Bin 0 -> 35444 bytes
 .../202107_ecomp/full-table-compaction-running.png     | Bin 0 -> 35926 bytes
 .../tablet-2_1fa2e8ba2e8ba328-files.gif                | Bin 0 -> 6501483 bytes
 .../tablet-2_385d1745d1745d88-files.gif                | Bin 0 -> 6535064 bytes
 .../tablet-2_6a8ba2e8ba2e8c78-files.gif                | Bin 0 -> 6467307 bytes
 .../blog/202107_ecomp/tablet-2_default-files.gif       | Bin 0 -> 5817065 bytes
 output/index.html                                      |  14 +-
 output/news/index.html                                 |   7 +
 output/search_data.json                                |   8 +
 21 files changed, 1365 insertions(+), 150 deletions(-)

diff --git a/output/blog/2021/07/08/external-compactions.html b/output/blog/2021/07/08/external-compactions.html
new file mode 100644
index 0000000..15061f2
--- /dev/null
+++ b/output/blog/2021/07/08/external-compactions.html
External Compactions

Author: Dave Marion, Keith Turner
Date:   08 Jul 2021

External compactions are a new feature in Accumulo 2.1.0 which allows
compaction work to run outside of Tablet Servers.
Overview

There are two types of compactions in Accumulo - Minor and Major. Minor
compactions flush recently written data from memory to a new file. Major
compactions merge two or more Tablet files together into one new file. Starting
in 2.1, Tablet Servers can run multiple major compactions for a Tablet
concurrently; there is no longer a single thread pool per Tablet Server that
runs compactions. Major compactions can be resource intensive and may run for a
long time depending on several factors, including the number and size of the
input files and the iterators configured to run during major compaction.
Additionally, the Tablet Server does not currently have a mechanism in place to
stop a major compaction that is taking too long or using too many resources.
There is a mechanism to throttle the read and write speed of major compactions
as a way to reduce the resource contention on a Tablet Server where many
concurrent compactions are running. However, throttling compactions on a busy
system will just lead to an increasing number of queued compactions. Finally,
major compaction work can be wasted in the event of an untimely death of the
Tablet Server or if a Tablet is migrated to another Tablet Server.

An external compaction is a major compaction that occurs outside of a Tablet
Server. The external compaction feature is an extension of the major compaction
service in the Tablet Server and is configured as part of the system's
compaction service configuration. Thus, it is an optional feature. The goal of
the external compaction feature is to overcome some of the drawbacks of the
major compactions that happen inside the Tablet Server. Specifically, external
compactions:

  • Allow major compactions to continue when the originating Tablet Server dies
  • Allow major compactions to occur while a Tablet migrates to a new Tablet Server
  • Reduce the load on the Tablet Server, giving it more cycles to insert mutations and respond to scans (assuming the Compactors are running on different hosts). MapReduce jobs and compactions can lower the effectiveness of processor and page caches for scans, so moving compactions off the host can be beneficial.
  • Allow major compactions to be scaled independently of the number of Tablet Servers, giving users more flexibility in allocating resources.
  • Even out hotspots where a few Tablet Servers have a lot of compaction work. External compactions allow this work to spread much wider than previously possible.
The external compaction feature in Apache Accumulo version 2.1.0 adds two new
system-level processes and new configuration properties. The new system-level
processes are the Compactor and the Compaction Coordinator.

  • The Compactor is a process that is responsible for executing a major compaction. There can be many Compactors running on a system. The Compactor communicates with the Compaction Coordinator to get information about the next major compaction it will run and to report the completion state.
  • The Compaction Coordinator is a single process, like the Manager. It is responsible for communicating with the Tablet Servers to gather information about queued external compactions, to reserve a major compaction on the Compactor's behalf, and to report the completion status of the reserved major compaction. For external compactions that complete when the Tablet is offline, the Compaction Coordinator buffers this information and reports it later.
Details

Before we explain the implementation for external compactions, it's probably
useful to explain the changes for major compactions that were made in the 2.1.0
branch before external compactions were added. This is most apparent in the
tserver.compaction.major.service and table.compaction.dispatcher configuration
properties. The simplest way to explain this is that you can now define a
service for executing compactions and then assign that service to a table
(which implies you can have multiple services assigned to different tables).
This gives the flexibility to prevent one table's compactions from impacting
another table. Each service has named thread pools with size thresholds.

Configuration

The configuration below defines a compaction service named cs1 using
the DefaultCompactionPlanner that is configured to have three named thread
pools (small, medium, and large). Each thread pool is configured with a number
of threads to run compactions and a size threshold. If the sum of the input
file sizes is less than 16MB, then the major compaction will be assigned to the
small pool, for example.
tserver.compaction.major.service.cs1.planner=org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner
tserver.compaction.major.service.cs1.planner.opts.executors=[
{"name":"small","type":"internal","maxSize":"16M","numThreads":8},
{"name":"medium","type":"internal","maxSize":"128M","numThreads":4},
{"name":"large","type":"internal","numThreads":2}]
To assign compaction service cs1 to the table ci, you would use the following properties:

config -t ci -s table.compaction.dispatcher=org.apache.accumulo.core.spi.compaction.SimpleCompactionDispatcher
config -t ci -s table.compaction.dispatcher.opts.service=cs1
A small modification to the
tserver.compaction.major.service.cs1.planner.opts.executors property in the
example above would enable it to use external compactions. For example, if you
wanted all of the large compactions to be done externally, you would use this
configuration:

tserver.compaction.major.service.cs1.planner.opts.executors=[
{"name":"small","type":"internal","maxSize":"16M","numThreads":8},
{"name":"medium","type":"internal","maxSize":"128M","numThreads":4},
{"name":"large","type":"external","queue":"DCQ1"}]
In this example the queue name DCQ1 is arbitrary, and using different queue names
allows you to define multiple pools of Compactors.
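For instance, a hypothetical layout that routes medium and large compactions to two separate external queues (the queue names EQ_MEDIUM and EQ_LARGE are made up, and it is assumed that an external executor entry may also carry a maxSize) might look like this sketch:

tserver.compaction.major.service.cs1.planner.opts.executors=[
{"name":"small","type":"internal","maxSize":"16M","numThreads":8},
{"name":"medium","type":"external","maxSize":"128M","queue":"EQ_MEDIUM"},
{"name":"large","type":"external","queue":"EQ_LARGE"}]

Compactors started for EQ_MEDIUM would then only pick up medium-sized work and could be scaled separately from the EQ_LARGE pool.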
Behind these new configurations in 2.1 lies a new algorithm for choosing which
files to compact. This algorithm attempts to find the smallest set of files
that meets the compaction ratio criteria. Prior to 2.1, Accumulo looked for the
largest set of files that met the criteria. Both algorithms do logarithmic
amounts of work. The new algorithm better utilizes the multiple thread pools
available for running compactions of different sizes.
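As a rough illustration of the ratio criteria (assuming the usual Accumulo rule that a set of files is eligible when the largest file's size times the ratio is at most the total size of the set), consider a Tablet with files of 40M, 30M, and 20M:

largest = 40M, total = 90M
ratio 3 : 40M * 3 = 120M >  90M  -> the set does not meet the criteria
ratio 2 : 40M * 2 =  80M <= 90M  -> the set meets the criteria and is compacted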
Compactor

A Compactor is started with the name of the queue for which it will complete
major compactions. You pass in the queue name when starting the Compactor, like
so:

bin/accumulo compactor -q DCQ1
Once started, the Compactor tries to find the location of the
Compaction Coordinator in ZooKeeper and connect to it. Then, it asks the
Compaction Coordinator for the next compaction job for the queue. The
Compaction Coordinator will return to the Compactor the necessary information to
run the major compaction, assuming there is work to be done. Note that the
class performing the major compaction in the Compactor is the same one used in
the Tablet Server, so we are just transferring all of the input parameters from
the Tablet Server to the Compactor. The Compactor communicates information back
to the Compaction Coordinator when the compaction has started, finished
(successfully or not), and during the compaction (progress updates).

Compaction Coordinator

The Compaction Coordinator is a singleton process in the system, like the
Manager. Also like the Manager, it supports standby Compaction Coordinators
using locks in ZooKeeper. The Compaction Coordinator is started using the
command:

bin/accumulo compaction-coordinator
When running, the Compaction Coordinator polls the Tablet Servers for summary
information about their external compaction queues. It keeps track of the major
compaction priorities for each Tablet Server and queue. When a Compactor
requests the next major compaction job, the Compaction Coordinator finds the
Tablet Server with the highest priority major compaction for that queue and
communicates with that Tablet Server to reserve an external compaction. The
priority in this case is an integer value based on the number of input files
for the compaction. For system compactions the number is negative, starting at
-32768 and increasing to -1; for user compactions it is a non-negative number
starting at 0 and limited to 32767. When the Tablet Server reserves the
external compaction, an entry is written into the metadata table row for the
Tablet with the address of the Compactor running the compaction and all of the
configuration information passed back from the Tablet Server. Below is an
example of the ecomp metadata column:

2;10ba2e8ba2e8ba5 ecomp:ECID:94db8374-8275-4f89-ba8b-4c6b3908bc50 []    {"inputs":["hdfs://accucluster/accumulo/tables/2/t-00000ur/A00001y9.rf","hdfs://accucluster/accumulo/tables/2/t-00000ur/C00005lp.rf","hdfs://accucluster/accumulo/tables/2/t-00000ur/F0000dqm.rf","hdfs://accucluster/accumulo/tables/2/t-00000ur/F0000dq1.rf"],"nextFiles":[],"tmp":"hdfs://accucluster/accumulo/tables/2/t-0 [...]
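To see these reservations on a running system, one option (assuming the default metadata table name) is to scan the ecomp column family from the Accumulo shell:

scan -t accumulo.metadata -c ecomp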
When the Compactor notifies the Compaction Coordinator that it has finished the
major compaction, the Compaction Coordinator attempts to notify the Tablet
Server and inserts an external compaction final state marker into the metadata
table. Below is an example of the final state marker:

~ecompECID:de6afc1d-64ae-4abf-8bce-02ec0a79aa6c : []        {"extent":{"tableId":"2"},"state":"FINISHED","fileSize":12354,"entries":100000}

If the Compaction Coordinator is able to reach the Tablet Server and that Tablet
Server is still hosting the Tablet, then the compaction is committed and both
of the entries are removed from the metadata table. In the case that the Tablet
is offline when the compaction attempts to commit, there is a thread in the
Compaction Coordinator that looks for completed, but not yet committed, external
compactions and periodically attempts to contact the Tablet Server hosting the
Tablet to commit the compaction. The Compaction Coordinator periodically removes
the final state markers related to Tablets that no longer exist. In the case of
an external compaction failure, the Compaction Coordinator notifies the Tablet
and the Tablet cleans up file reservations and removes the metadata entry.
Edge Cases

There are several situations involving external compactions that we tested as part of this feature. These are:

  • Tablet migration
  • When a user initiated compaction is canceled
  • When a Table is taken offline
  • When a Tablet is split or merged
  • Coordinator restart
  • Tablet Server death
  • Table deletion

Compactors periodically check if the compaction they are running is related to
a deleted table, a split or merged Tablet, or a canceled user initiated compaction.
If any of these cases happen, the Compactor interrupts the compaction and notifies
the Compaction Coordinator. An external compaction continues in the case of
Tablet Server death, Tablet migration, Coordinator restart, and the Table being
taken offline.
Cluster Test

The following tests were run on a cluster to exercise this new feature.

  1. Run continuous ingest for 24h with large compactions running externally in an autoscaled Kubernetes cluster.
  2. After ingest completion, run a full table compaction with all compactions running externally.
  3. Run the continuous ingest verification process that looks for lost data.
Setup

For these tests Accumulo, Zookeeper, and HDFS were run on a cluster in Azure
set up by Muchos, and external compactions were run in a separate Kubernetes
cluster running in Azure. The Accumulo cluster had the following
configuration.

  • Centos 7
  • Open JDK 11
  • Zookeeper 3.6.2
  • Hadoop 3.3.0
  • Accumulo 2.1.0-SNAPSHOT dad7e01
  • 23 D16s_v4 VMs, each with 16x128G HDDs striped using LVM. 22 were workers.
The following diagram shows how the two clusters were set up. The Muchos and
Kubernetes clusters were on the same private vnet, each with its own /16 subnet
in the 10.x.x.x IP address space. The Kubernetes cluster that ran external
compactions was backed by at least 3 D8s_v4 VMs, with VMs autoscaling with the
number of pods running.

[Diagram: Cluster Layout]

One problem we ran into was communication between Compactors running inside
Kubernetes and processes like the Compaction Coordinator and DataNodes running
outside of Kubernetes in the Muchos cluster. For some insight into how these
problems were overcome, check out the comments in the deployment spec
(accumulo-compactor-muchos.yaml) used.
Configuration

The following Accumulo shell commands set up a new compaction service named
cs1. This compaction service has an internal executor with 4 threads named
small for compactions less than 32M, an internal executor with 2 threads named
medium for compactions less than 128M, and an external compaction queue named
DCQ1 for all other compactions.

config -s 'tserver.compaction.major.service.cs1.planner.opts.executors=[{"name":"small","type":"internal","maxSize":"32M","numThreads":4},{"name":"medium","type":"internal","maxSize":"128M","numThreads":2},{"name":"large","type":"external","queue":"DCQ1"}]'
config -s tserver.compaction.major.service.cs1.planner=org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner
The continuous ingest table was configured to use the above compaction service.
The table's compaction ratio was also lowered from the default of 3 to 2. A
lower compaction ratio results in fewer files per Tablet and more compaction
work.

config -t ci -s table.compaction.dispatcher=org.apache.accumulo.core.spi.compaction.SimpleCompactionDispatcher
config -t ci -s table.compaction.dispatcher.opts.service=cs1
config -t ci -s table.compaction.major.ratio=2
The Compaction Coordinator was manually started on the Muchos VM where the
Accumulo Manager, Zookeeper server, and the Namenode were running. The
following command was used to do this.

nohup accumulo compaction-coordinator >/var/data/logs/accumulo/compaction-coordinator.out 2>/var/data/logs/accumulo/compaction-coordinator.err &
To start Compactors, Accumulo's docker image was
built from the next-release branch by checking out the Apache Accumulo git
repo at commit dad7e01 and building the binary distribution using the
command mvn clean package -DskipTests. The resulting tar file was copied to
the accumulo-docker base directory and the image was built using the command:

docker build --build-arg ACCUMULO_VERSION=2.1.0-SNAPSHOT --build-arg ACCUMULO_FILE=accumulo-2.1.0-SNAPSHOT-bin.tar.gz \
             --build-arg HADOOP_FILE=hadoop-3.3.0.tar.gz \
             --build-arg ZOOKEEPER_VERSION=3.6.2 --build-arg ZOOKEEPER_FILE=apache-zookeeper-3.6.2-bin.tar.gz \
             -t accumulo .
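Tagging and pushing the image to a registry, as described next, is standard Docker usage; a sketch with a placeholder registry name:

docker tag accumulo myregistry.example.com/accumulo:2.1.0-SNAPSHOT
docker push myregistry.example.com/accumulo:2.1.0-SNAPSHOT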
The Docker image was tagged and then pushed to a container registry accessible by
Kubernetes. Then the following commands were run to start the Compactors using
accumulo-compactor-muchos.yaml. The yaml file contains comments explaining
issues related to IP addresses and DNS names.

kubectl apply -f accumulo-compactor-muchos.yaml
kubectl autoscale deployment accumulo-compactor --cpu-percent=80 --min=10 --max=660

The autoscale command causes Compactors to scale between 10
and 660 pods based on CPU usage. When the pods' average CPU is above 80%,
pods are added to meet the 80% goal. When it's below 80%, pods
are stopped to meet the 80% goal, with 5 minutes between scale down
events. This can sometimes lead to running compactions being
stopped. During the test there were ~537 dead compactions that were probably
caused by this (there were 44K successful external compactions). The max of 660
was chosen based on the number of datanodes in the Muchos cluster. There were
22 datanodes and 30x22=660, so this conceptually sets a limit of 30 external
compactions per datanode. This was well tolerated by the Muchos cluster. One
important lesson we learned is that external compactions can strain the HDFS
DataNodes, so it's important to consider how many concurrent external
compactions will be running. The Muchos cluster had 22x16=352 cores on the
worker VMs, so the max of 660 exceeds what the Muchos cluster could run itself.
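The kubectl autoscale command above is shorthand for creating a HorizontalPodAutoscaler object; a roughly equivalent manifest (a sketch for reference, not the one used in the test) would be:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: accumulo-compactor
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: accumulo-compactor
  minReplicas: 10
  maxReplicas: 660
  targetCPUUtilizationPercentage: 80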
Ingesting data

After starting the Compactors, 22 continuous ingest clients (from
accumulo-testing) were started. The following plot shows the number of
compactions running in the three different compaction queues
configured. The executor cs1_small is for compactions <= 32M and it stayed
pretty busy, as minor compactions constantly produce new small files. In 2.1.0
merging minor compactions were removed, so it's important to ensure a
compaction queue is properly configured for new small files. The executor
cs1_medium was for compactions >32M and <=128M; it was not as busy, but did
have steady work. The external compaction queue DCQ1 processed all compactions
over 128M and had some spikes of work. These spikes are to be expected with
continuous ingest, as all Tablets are written to evenly and eventually all of
the Tablets need to run large compactions around the same time.

[Plot: Compactions Running]

The following plot shows the number of pods running in Kubernetes. As
Compactors used more or less CPU, the number of pods automatically scaled up
and down.

[Plot: Pods Running]

The following plot shows the number of compactions queued. When the
compactions queued for cs1_small spiked above 750, the small executor was
adjusted from 4 threads per Tablet Server to 6 threads. This configuration
change was made while everything was running; the Tablet Servers saw it and
reconfigured their thread pools on the fly.

[Plot: Compactions Queued]
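The on-the-fly change from 4 to 6 small threads mentioned above amounts to re-setting the executors property; a sketch of the shell command (only numThreads for the small executor differs from the configuration shown earlier) is:

config -s 'tserver.compaction.major.service.cs1.planner.opts.executors=[{"name":"small","type":"internal","maxSize":"32M","numThreads":6},{"name":"medium","type":"internal","maxSize":"128M","numThreads":2},{"name":"large","type":"external","queue":"DCQ1"}]'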
The metrics emitted by Accumulo for these plots had the following names.

  • TabletServer1.tserver.compactionExecutors.e_DCQ1_queued
  • TabletServer1.tserver.compactionExecutors.e_DCQ1_running
  • TabletServer1.tserver.compactionExecutors.i_cs1_medium_queued
  • TabletServer1.tserver.compactionExecutors.i_cs1_medium_running
  • TabletServer1.tserver.compactionExecutors.i_cs1_small_queued
  • TabletServer1.tserver.compactionExecutors.i_cs1_small_running

Tablet Servers emit metrics about queued and running compactions for every
compaction executor configured. Users can observe these metrics and tune
the configuration based on what they see, as was done in this test.
The following plot shows the average number of files per Tablet during the
test. The numbers are what would be expected for a compaction ratio of 2 when
the system is keeping up with compaction work. Also, animated GIFs were created to
show a few Tablets' files over time.

[Plot: Files Per Tablet]

The following is a plot of the number of Tablets during the test.
Eventually there were 11.28K Tablets, around 512 Tablets per Tablet Server. The
Tablets were close to splitting again at the end of the test, as each Tablet was
getting close to 1G.

[Plot: Online Tablets]

The following plot shows the ingest rate over time. The rate goes down as the
number of Tablets per Tablet Server goes up; this is expected.

[Plot: Ingest Rate]

The following plot shows the number of key/values in Accumulo during
the test. When ingest was stopped, there were 266 billion key/values in the
continuous ingest table.

[Plot: Table Entries]
Full table compaction

After stopping ingest and letting things settle, a full table compaction was
kicked off. Since all of these compactions would be over 128M, all of them were
scheduled on the external queue DCQ1. The two plots below show compactions
running and queued for the ~2 hours it took to do the compaction. When the
compaction was initiated there were 10 Compactors running in pods. All 11K
Tablets were queued for compaction, and because the pods were always running at
high CPU, Kubernetes kept adding pods until the max was reached, resulting in 660
Compactors running until all the work was done.
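The exact shell command used is not shown in the post, but a full table compaction is typically initiated from the Accumulo shell with something along these lines (the -w flag waits for the compaction to finish):

compact -t ci -w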
[Plot: Full Table Compactions Running]

[Plot: Full Table Compactions Queued]
Verification

After running everything mentioned above, the continuous ingest verification
MapReduce job was run. This job looks for holes in the linked list produced
by continuous ingest, which would indicate Accumulo lost data. No holes were found.
The counts below were emitted by the job. If there were holes, a non-zero
UNDEFINED count would be present.

        org.apache.accumulo.testing.continuous.ContinuousVerify$Counts
                REFERENCED=266225036149
                UNREFERENCED=22010637
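Assuming the standard accumulo-testing scripts, the verification job is typically launched with a command along these lines (cluster-specific options omitted):

./bin/cingest verify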
Hurdles

How to Scale Up

We ran into several issues running the Compactors in Kubernetes. First, we knew
that we could use the Kubernetes Horizontal Pod Autoscaler (HPA) to scale the
Compactors up and down based on load, but the question remained how to do that.
Probably the best metric to use for scaling the Compactors is the size of the
external compaction queue. Another possible solution is to take the DataNode
utilization into account somehow. We found that in scaling up the Compactors
based on their CPU usage we could overload DataNodes. Once DataNodes were
overwhelmed, the Compactors' CPU would drop and the number of pods would naturally
scale down.

To use custom metrics, you would need to get the metrics from Accumulo into a
metrics store that has a metrics adapter. One possible solution, available
in Hadoop 3.3.0, is to use Prometheus, the Prometheus Adapter, and enable
the Hadoop PrometheusMetricsSink added in
HADOOP-16398 to expose the custom queue
size metrics. This seemed like the right solution, but it also seemed like a
lot of work that was outside the scope of this blog post. Ultimately we decided
to take the simplest approach - use the native Kubernetes metrics-server and
scale off the CPU usage of the Compactors. As you can see in the "Compactions Queued"
and "Compactions Running" graphs above from the full table compaction, it took about
45 minutes for Kubernetes to scale up Compactors to the maximum configured (660). Compactors
likely would have been scaled up much faster if scaling was done off the queued compactions
instead of CPU usage.
Gracefully Scaling Down

The Kubernetes Pod termination process provides a mechanism for the user to
define a pre-stop hook that will be called before the Pod is terminated.
Without this hook, Kubernetes sends a SIGTERM to the Pod, followed by a
user-defined grace period, then a SIGKILL. For the purposes of this test we did
not define a pre-stop hook or a grace period. It's likely possible to handle
this situation more gracefully, but for this test our Compactors were killed
and the compaction work lost when the HPA decided to scale down the Compactors.
It was a good test of how we handled failed Compactors. Investigation is
needed to determine if changes are needed in Accumulo to facilitate graceful
scale down.
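For reference, a pre-stop hook and grace period are declared in the pod template of the Deployment. A hypothetical sketch is shown below; the stop script is made up, since Accumulo does not currently ship one:

spec:
  terminationGracePeriodSeconds: 300
  containers:
  - name: accumulo-compactor
    lifecycle:
      preStop:
        exec:
          # hypothetical script that would let a Compactor finish or abandon its work cleanly
          command: ["/bin/sh", "-c", "/opt/accumulo/stop-compactor-gracefully.sh"]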
How to Connect

The other major issue we ran into was connectivity between the Compactors and
the other server processes. The Compactor communicates with ZooKeeper and the
Compaction Coordinator, both of which were running outside of Kubernetes. There
is no common DNS between the Muchos and Kubernetes clusters, but IPs were
visible to both. The Compactor connects to ZooKeeper to find the address of the
Compaction Coordinator so that it can connect to it and look for work. By
default the Accumulo server processes use the hostname as their address, which
would not work as those names would not resolve inside the Kubernetes cluster.
We had to start the Accumulo processes using the -a argument and set the
hostname to the IP address. Solving connectivity issues between components
running in Kubernetes and components external to Kubernetes depends on the
capabilities available in the environment, and the -a option may be part of the
solution.
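As a sketch (the IP address is illustrative, and the same pattern applies to the other server processes), starting the Compaction Coordinator with its advertised address pinned to an IP looks like:

bin/accumulo compaction-coordinator -a 10.100.0.5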
Conclusion

In this blog post we introduced the concept and benefits of external
compactions, the new server processes, and how to configure the compaction
service. We deployed a 23-node Accumulo cluster using Muchos along with a
variable sized Kubernetes cluster that dynamically scaled Compactors from 10 to
660 instances on 3 to 100 compute nodes. We ran continuous ingest on the
Accumulo cluster to create compactions that were run both internal and external
to the Tablet Server and demonstrated external compactions completing
successfully and Compactors being killed.

We also discussed running the following tests, but did not have time:

  • Agitating the Compaction Coordinator, Tablet Servers, and Compactors while ingest was running.
  • Comparing the impact on queries for internal vs external compactions.
  • Having multiple external compaction queues, each with its own set of autoscaled Compactor pods.
  • Forcing full table compactions while ingest was running.

The tests we ran show that basic functionality works well; it would be nice to
stress the feature in other ways, though.
Set the following in <code class="language-plaintext highlighter-rouge">accumulo.properties</code>.</p> - -<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="py">instance.volumes</span><span class="p">=</span><span class="s">hdfs://&lt;name node&gt;/accumulo,s3a://&lt;bucket&gt;/accumulo</span> -<span class="py">general.volume.chooser</span><span class="p">=</span><span class="s">org.apache.accumulo.server.fs.PreferredVolumeChooser</span> -<span class="py">general.custom.volume.preferred.default</span><span class="p">=</span><span class="s">s3a://&lt;bucket&gt;/accumulo</span> -<span class="py">general.custom.volume.preferred.logger</span><span class="p">=</span><span class="s">hdfs://&lt;namenode&gt;/accumulo</span> - -</code></pre></div></div> - -<p>Run <code class="language-plaintext highlighter-rouge">accumulo init --add-volumes</code> to initialize the S3 volume. Doing this -in two steps avoids putting any Accumulo metadata files in S3 during init. -Copy <code class="language-plaintext highlighter-rouge">accumulo.properties</code> to all nodes and start Accumulo.</p> - -<p>Individual tables can be configured to store their files in HDFS by setting the -table property <code class="language-plaintext highlighter-rouge">table.custom.volume.preferred</code>. This should be set for the -metadata table in case it splits using the following Accumulo shell command.</p> - -<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>config -t accumulo.metadata -s table.custom.volume.preferred=hdfs://&lt;namenode&gt;/accumulo -</code></pre></div></div> - -<h2 id="accumulo-example">Accumulo example</h2> - -<p>The following Accumulo shell session shows an example of writing data to S3 and -reading it back. It also shows scanning the metadata table to verify the data -is stored in S3.</p> - -<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@muchos&gt; createtable s3test -root@muchos s3test&gt; insert r1 f1 q1 v1 -root@muchos s3test&gt; insert r1 f1 q2 v2 -root@muchos s3test&gt; flush -w -2019-09-10 19:39:04,695 [shell.Shell] INFO : Flush of table s3test completed. -root@muchos s3test&gt; scan -r1 f1:q1 [] v1 -r1 f1:q2 [] v2 -root@muchos s3test&gt; scan -t accumulo.metadata -c file -2&lt; file:s3a://&lt;bucket&gt;/accumulo/tables/2/default_tablet/F000007b.rf [] 234,2 -</code></pre></div></div> - -<p>These instructions were only tested a few times and may not result in a stable -system. I have <a href="https://gist.github.com/keith-turner/149f35f218d10e13227461714012d7bf">run</a> a 24hr test with Accumulo and S3.</p> - -<h2 id="is-s3guard-needed">Is S3Guard needed?</h2> - -<p>I am not completely certain about this, but I don’t think S3Guard is needed for -regular Accumulo tables. There are two reasons I think this is so. First each -Accumulo user tablet stores its list of files in the metadata table using -absolute URIs. This allows a tablet to have files on multiple DFS instances. -Therefore Accumulo never does a DFS list operation to get a tablets files, it -always uses whats in the metadata table. Second, Accumulo gives each file a -unique name using a counter stored in Zookeeper and file names are never -reused.</p> - -<p>Things are sligthly different for Accumulo’s metadata. User tablets store -their file list in the metadata table. Metadata tablets store their file list -in the root table. The root table stores its file list in DFS. 
Therefore it -would be dangerous to place the root tablet in S3 w/o using S3Guard. That is -why these instructions place Accumulo metadata in HDFS. <strong>Hopefully</strong> this -configuration allows the system to be consistent w/o using S3Guard.</p> - -<p>When Accumulo 2.1.0 is released with the changes made by <a href="https://github.com/apache/accumulo/issues/1313">#1313</a> for issue -<a href="https://github.com/apache/accumulo/issues/936">#936</a>, it may be possible to store the metadata table in S3 w/o -S3Gaurd. If this is the case then only the write ahead logs would need to be -stored in HDFS.</p> - - - Tue, 10 Sep 2019 00:00:00 +0000 - https://accumulo.apache.org/blog/2019/09/10/accumulo-S3-notes.html - https://accumulo.apache.org/blog/2019/09/10/accumulo-S3-notes.html - - - blog - - - diff --git a/output/images/blog/202107_ecomp/accumulo-compactor-muchos.yaml b/output/images/blog/202107_ecomp/accumulo-compactor-muchos.yaml new file mode 100644 index 0000000..fc5b7a2 --- /dev/null +++ b/output/images/blog/202107_ecomp/accumulo-compactor-muchos.yaml @@ -0,0 +1,180 @@ +# The following instance properties are mapped into the containers as +# accumulo.properties and will be picked up by compactor processes. Its +# important that the instance props exactly match those of all other Accumulo +# processes in the cluster in order for them to be able to communicate with each +# other. So if all other processes use ect2-0 as the zookeeper host, we could +# not use the IP addr here and still authenticate with other proccesses because +# the values for the prop would not match exactly. See the comment about +# hostnames further down. +apiVersion: v1 +kind: ConfigMap +metadata: + name: accumulo-properties +data: + accumulo.properties: | + coordinator.port.client=30100 + general.rpc.timeout=240s + instance.secret=muchos + instance.volumes=hdfs://accucluster/accumulo + instance.zookeeper.host=ect2-0:2181 +--- +apiVersion: v1 +kind: ConfigMap +metadata: + name: accumulo-logging +data: + log4j2-service.properties: | + status = info + dest = err + name = AccumuloCompactorLoggingProperties + appender.console.type = Console + appender.console.name = STDOUT + appender.console.target = SYSTEM_OUT + appender.console.layout.type = PatternLayout + appender.console.layout.pattern = %d{ISO8601} [%-8c{2}] %-5p: %m%n + appender.console.filter.threshold.type = ThresholdFilter + appender.console.filter.threshold.level = info + logger.hadoop.name = org.apache.hadoop + logger.hadoop.level = debug + logger.zookeeper.name = org.apache.zookeeper + logger.zookeeper.level = error + logger.accumulo.name = org.apache.accumulo + logger.accumulo.level = debug + rootLogger.level = info + rootLogger.appenderRef.console.ref = STDOUT +--- +# Muchos configured a nameservice of accucluster for HDFS. Urls sent to +# compactors from the coordinator would be of the form hdfs://accucluster/... . +# Therefore compactors need to be able to resolve the accucluster nameservice. +# The following data is mapped into the containers as the hdfs-site.xml file in +# order to support resolving accucluster. 
+apiVersion: v1 +kind: ConfigMap +metadata: + name: hdfs-site +data: + hdfs-site.xml: | + + + dfs.datanode.synconclose + true + + + dfs.nameservices + accucluster + + + dfs.internal.nameservices + accucluster + + + dfs.namenode.secondary.http-address.accucluster + ect2-0:50090 + + + dfs.namenode.secondary.https-address.accucluster + ect2-0:50091 + + + dfs.namenode.rpc-address.accucluster + ect2-0:8020 + + + dfs.namenode.http-address.accucluster + ect2-0:50070 + + + dfs.namenode.https-address.accucluster + ect2-0:50071 + + + dfs.namenode.servicerpc-address.accucluster + ect2-0:8025 + + + dfs.namenode.lifeline.rpc-address.accucluster + ect2-0:8050 + + + dfs.client.failover.proxy.provider.accucluster + org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider + + +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: accumulo-compactor + labels: + app: accumulo-compactor +spec: + replicas: 10 + selector: + matchLabels: + app: accumulo-compactor + template: + metadata: + labels: + app: accumulo-compactor + spec: + # There was no shared DNS setup between the Muchos cluster and Kubernetes + # cluster, however they did share a private vnet. Also Kubernetes was + # configured to give each pod an IP addr on the vnet, making the pods IP + # addrs visible to any process on the Muchos cluster. So the two + # clusters could see each others IP addrs, but did not share a common + # DNS. Compactors need to access Zookeeper and the Coordinator and both + # of these ran on host ect2-0, a hostname defined on the Muchos cluster. + # Therefore the hostname ect2-0 is defined for each pod below so that it + # can resolve this VM running in the Muchos cluster. Given time a + # solution using DNS between the two clusters could probably be figured + # out. This solution would probably be specific to an environment and + # Muchos tries to make very few assumptions. + hostNetwork: false + hostAliases: + - ip: "10.1.0.5" + hostnames: + - "ect2-0" + containers: + - name: accumulo-compactor + image: extcompne.azurecr.io/accumulo + command: ["/bin/bash","-c"] + # Since the Kubernetes and Muchos clusters do not share a common DNS, + # its very important that compactor processes advertise themselves in + # Zookeeper using IP addrs. In the following command -a $(hostname -I) + # accomplishes this. 
+ args: ["accumulo compactor -q DCQ1 -a $(hostname -I)"] + ports: + - containerPort: 9101 + resources: + requests: + cpu: .9 + memory: "2048Mi" + limits: + cpu: 1 + memory: "2048Mi" + env: + - name: HADOOP_USER_NAME + value: centos + - name: ACCUMULO_JAVA_OPTS + value: "-Xmx1g" + volumeMounts: + - name: "config" + mountPath: "/opt/accumulo/conf/accumulo.properties" + subPath: "accumulo.properties" + - name: "logging" + mountPath: "/opt/accumulo/conf/log4j2-service.properties" + subPath: "log4j2-service.properties" + - name: "hdfs" + mountPath: "/opt/hadoop/etc/hadoop/hdfs-site.xml" + subPath: "hdfs-site.xml" + volumes: + - name: "config" + configMap: + name: "accumulo-properties" + - name: "logging" + configMap: + name: "accumulo-logging" + - name: "hdfs" + configMap: + name: "hdfs-site" + diff --git a/output/images/blog/202107_ecomp/ci-entries.png b/output/images/blog/202107_ecomp/ci-entries.png new file mode 100644 index 0000000..da96293 Binary files /dev/null and b/output/images/blog/202107_ecomp/ci-entries.png differ diff --git a/output/images/blog/202107_ecomp/ci-files-per-tablet.png b/output/images/blog/202107_ecomp/ci-files-per-tablet.png new file mode 100644 index 0000000..f831e76 Binary files /dev/null and b/output/images/blog/202107_ecomp/ci-files-per-tablet.png differ diff --git a/output/images/blog/202107_ecomp/ci-ingest-rate.png b/output/images/blog/202107_ecomp/ci-ingest-rate.png new file mode 100644 index 0000000..97ab2ed Binary files /dev/null and b/output/images/blog/202107_ecomp/ci-ingest-rate.png differ diff --git a/output/images/blog/202107_ecomp/ci-online-tablets.png b/output/images/blog/202107_ecomp/ci-online-tablets.png new file mode 100644 index 0000000..8fbf95f Binary files /dev/null and b/output/images/blog/202107_ecomp/ci-online-tablets.png differ diff --git a/output/images/blog/202107_ecomp/ci-pods-running.png b/output/images/blog/202107_ecomp/ci-pods-running.png new file mode 100644 index 0000000..34d4a93 Binary files /dev/null and b/output/images/blog/202107_ecomp/ci-pods-running.png differ diff --git a/output/images/blog/202107_ecomp/ci-queued.png b/output/images/blog/202107_ecomp/ci-queued.png new file mode 100644 index 0000000..349371d Binary files /dev/null and b/output/images/blog/202107_ecomp/ci-queued.png differ diff --git a/output/images/blog/202107_ecomp/ci-running.png b/output/images/blog/202107_ecomp/ci-running.png new file mode 100644 index 0000000..765c5c2 Binary files /dev/null and b/output/images/blog/202107_ecomp/ci-running.png differ diff --git a/output/images/blog/202107_ecomp/clusters-layout.png b/output/images/blog/202107_ecomp/clusters-layout.png new file mode 100644 index 0000000..1ae43e6 Binary files /dev/null and b/output/images/blog/202107_ecomp/clusters-layout.png differ diff --git a/output/images/blog/202107_ecomp/files_over_time.html b/output/images/blog/202107_ecomp/files_over_time.html new file mode 100644 index 0000000..dbaa6d9 --- /dev/null +++ b/output/images/blog/202107_ecomp/files_over_time.html @@ -0,0 +1,23 @@ + + +

Below are animated gifs from four random tablets in the ci table showing how their files changed during the ingest phase of the test. While the test was running, the ecomp and file columns from the metadata table were dumped to a file every 60 seconds. The dump file name included a timestamp, and that file name is shown in the animated gif, which helps correlate the gifs with the plots. Blips in the ecomp column can be seen when external compactions ran. The animated gifs run at 24 frames per second, with each frame presenting the data captured in a minute, so each second of a gif shows 24 minutes of captured data. Each gif is approximately 60 seconds long.
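A small loop along these lines is one way to produce such dumps. This is only a sketch, not the exact script used for this test: the shell credentials and output file naming are assumptions, while the table id 2 matches the 2;... end rows shown below.

<pre><code># Dump the file and ecomp metadata columns for table id 2 once a minute,
# putting a timestamp in each dump file name.
while true; do
  ts=$(date +%Y%m%d-%H%M%S)
  accumulo shell -u root -p '&lt;password&gt;' \
    -e "scan -t accumulo.metadata -b 2; -e 2&lt; -c file,ecomp -np" \
    &gt; "tablet-files-${ts}.txt"
  sleep 60
done
</code></pre>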

+ +

The following shows the files over time for the tablet with end row 2;1fa2e8ba2e8ba328

+

+

The following shows the files over time for the tablet with end row 2;385d1745d1745d88

+

+

The following shows the files over time for the tablet with end row 2;6a8ba2e8ba2e8c78

+

+

The following shows the files over time for the default tablet with end row 2<

+

+ + + diff --git a/output/images/blog/202107_ecomp/full-table-compaction-queued.png b/output/images/blog/202107_ecomp/full-table-compaction-queued.png new file mode 100644 index 0000000..45b83d7 Binary files /dev/null and b/output/images/blog/202107_ecomp/full-table-compaction-queued.png differ diff --git a/output/images/blog/202107_ecomp/full-table-compaction-running.png b/output/images/blog/202107_ecomp/full-table-compaction-running.png new file mode 100644 index 0000000..d5772cd Binary files /dev/null and b/output/images/blog/202107_ecomp/full-table-compaction-running.png differ diff --git a/output/images/blog/202107_ecomp/tablet-2_1fa2e8ba2e8ba328-files.gif b/output/images/blog/202107_ecomp/tablet-2_1fa2e8ba2e8ba328-files.gif new file mode 100644 index 0000000..dcef16d Binary files /dev/null and b/output/images/blog/202107_ecomp/tablet-2_1fa2e8ba2e8ba328-files.gif differ diff --git a/output/images/blog/202107_ecomp/tablet-2_385d1745d1745d88-files.gif b/output/images/blog/202107_ecomp/tablet-2_385d1745d1745d88-files.gif new file mode 100644 index 0000000..765a6c9 Binary files /dev/null and b/output/images/blog/202107_ecomp/tablet-2_385d1745d1745d88-files.gif differ diff --git a/output/images/blog/202107_ecomp/tablet-2_6a8ba2e8ba2e8c78-files.gif b/output/images/blog/202107_ecomp/tablet-2_6a8ba2e8ba2e8c78-files.gif new file mode 100644 index 0000000..2b75d26 Binary files /dev/null and b/output/images/blog/202107_ecomp/tablet-2_6a8ba2e8ba2e8c78-files.gif differ diff --git a/output/images/blog/202107_ecomp/tablet-2_default-files.gif b/output/images/blog/202107_ecomp/tablet-2_default-files.gif new file mode 100644 index 0000000..33ac376 Binary files /dev/null and b/output/images/blog/202107_ecomp/tablet-2_default-files.gif differ diff --git a/output/index.html b/output/index.html index 38942ca..04730a6 100644 --- a/output/index.html +++ b/output/index.html @@ -181,6 +181,13 @@
+ Jul 2021 + External Compactions +
+
+ +
+ @@ -207,13 +214,6 @@
- - diff --git a/output/news/index.html b/output/news/index.html index a7fb684..3b85f98 100644 --- a/output/news/index.html +++ b/output/news/index.html @@ -150,6 +150,13 @@
+
Jul 08
+ +
+ + + + diff --git a/output/search_data.json b/output/search_data.json index 0fbfbaa..a05880a 100644 --- a/output/search_data.json +++ b/output/search_data.json @@ -302,6 +302,14 @@ }, + "blog-2021-07-08-external-compactions-html": { + "title": "External Compactions", + "content" : "External compactions are a new feature in Accumulo 2.1.0 which allowscompaction work to run outside of Tablet Servers.OverviewThere are two types of compactions in Accumulo - Minor and Major. Minorcompactions flush recently written data from memory to a new file. Majorcompactions merge two or more Tablet files together into one new file. Startingin 2.1 Tablet Servers can run multiple major compactions for a Tabletconcurrently; there is no longer a single thread pool p [...] + "url": " /blog/2021/07/08/external-compactions.html", + "categories": "blog" + } + , + "blog-2021-04-21-jshell-accumulo-feature-html": { "title": "Jshell Accumulo Feature", "content" : "OverviewFirst introduced in Java 9, JShell is an interactive Read-Evaluate-Print-Loop (REPL) Java tool that interprets user’s input and outputs the results. This tool provides a convenient way to test out and execute quick tasks with Accumulo in the terminal. This feature is a part of the upcoming Accumulo 2.1 release. If you’re a developer and want to get involved in testing, contact us or review our contributing guide.Major Features Default JShell script provid [...]