lucene-commits mailing list archives

From a...@apache.org
Subject [lucene-solr] 02/04: SOLR-14615: Implement CPU Utilization Based Circuit Breaker (#1737)
Date Fri, 16 Oct 2020 09:56:24 GMT
This is an automated email from the ASF dual-hosted git repository.

atri pushed a commit to branch branch_8x
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git

commit c841f63e4ba8615d648f00741d4ee3628cd60954
Author: Atri Sharma <atri.jiit@gmail.com>
AuthorDate: Thu Aug 20 13:21:26 2020 +0530

    SOLR-14615: Implement CPU Utilization Based Circuit Breaker (#1737)
    
    This commit introduces CPU based circuit breaker. This circuit breaker
    tracks the average CPU load per minute and triggers if the value exceeds
    a configurable value.
    
    This commit also adds a specific control flag for Memory Circuit Breaker
    to allow enabling/disabling the same.
---
 solr/CHANGES.txt                                   | 101 ++++++++++++++++++
 .../src/java/org/apache/solr/core/SolrConfig.java  |  27 +++--
 .../util/circuitbreaker/CPUCircuitBreaker.java     | 116 +++++++++++++++++++++
 .../solr/util/circuitbreaker/CircuitBreaker.java   |   8 ++
 .../util/circuitbreaker/CircuitBreakerManager.java |  13 ++-
 .../util/circuitbreaker/MemoryCircuitBreaker.java  |  20 +++-
 .../resources/EditableSolrConfigAttributes.json    |  13 ++-
 .../conf/solrconfig-memory-circuitbreaker.xml      |  11 +-
 .../test/org/apache/solr/core/SolrCoreTest.java    |   3 +-
 .../org/apache/solr/util/TestCircuitBreaker.java   | 104 +++++++++++++++++-
 .../solr/configsets/_default/conf/solrconfig.xml   |  40 ++++---
 solr/solr-ref-guide/src/circuit-breakers.adoc      |  38 +++++--
 12 files changed, 447 insertions(+), 47 deletions(-)
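
Note on the trip decision: the commit message describes a breaker that reads the
one-minute load average and trips once it reaches a configured value. The sketch
below restates that decision outside Solr, as an illustration only. The class
name LoadAverageBreakerSketch and the isTripped signature are hypothetical; only
OperatingSystemMXBean.getSystemLoadAverage() is taken from the patch. That call
reports a load average (a negative value when unavailable), so the threshold is
compared against a load figure rather than a percentage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical standalone class; not part of the commit and independent of Solr.
class LoadAverageBreakerSketch {
  private static final OperatingSystemMXBean OS_BEAN =
      ManagementFactory.getOperatingSystemMXBean();

  // Same order of checks as the patch: global switch, per-breaker switch,
  // "load unavailable" guard, then the threshold comparison.
  static boolean isTripped(boolean globalEnabled, boolean breakerEnabled,
                           double seenLoad, double threshold) {
    if (!globalEnabled || !breakerEnabled) {
      return false;
    }
    if (seenLoad < 0) {
      // getSystemLoadAverage() returns a negative value when the platform
      // cannot supply a load average; never trip on missing data.
      return false;
    }
    return seenLoad >= threshold;
  }

  public static void main(String[] args) {
    double load = OS_BEAN.getSystemLoadAverage();
    System.out.println("load=" + load + " tripped=" + isTripped(true, true, load, 75.0));
  }
}
```

Passing the observed load in as a parameter keeps the decision testable, which is
the same reason the patch makes calculateLiveCPUUsage() overridable.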

diff --git a/solr/CHANGES.txt b/solr/CHANGES.txt
index 5f34199..6898b5d 100644
--- a/solr/CHANGES.txt
+++ b/solr/CHANGES.txt
@@ -29,6 +29,107 @@ Improvements
 ----------------------
 * LUCENE-8984: MoreLikeThis MLT is biased for uncommon fields (Andy Hind via Anshum Gupta)
 
+* SOLR-14223: PKI Auth can bootstrap from existing key files instead of creating new keys on startup (Mike Drob)
+
+* SOLR-11725: Use corrected sample formula for computing stdDev and variance in JSON aggregations
+  (hossman, Munendra S N, yonik)
+
+* SOLR-14387: SolrClient.getById() will escape comma separator within ids (Markus Schuch via Mike Drob)
+
+* SOLR-10814: Add short-name feature to RuleBasedAuthz plugin (Mike Drob, Hrishikesh Gadre)
+
+* SOLR-7683 Introduce support to identify Solr internal request types (Atri Sharma, Hrishikesh Gadre)
+
+* SOLR-13528 Rate Limiting in Solr (Atri Sharma, Mike Drob)
+
+* SOLR-14615: CPU Utilization Based Circuit Breaker (Atri Sharma)
+
+Other Changes
+----------------------
+* SOLR-14656: Autoscaling framework removed (Ishan Chattopadhyaya, noble, Ilan Ginzburg)
+
+* LUCENE-9391: Upgrade HPPC to 0.8.2. (Haoyu Zhai)
+
+* SOLR-10288: Remove non-minified JavaScript from the webapp. (Erik Hatcher, marcussorealheis)
+
+* SOLR-13655: Upgrade Collections.unModifiableSet to Set.of and Set.copyOf (Atri Sharma via Tomás Fernández Löbbe)
+
+* SOLR-13797: SolrResourceLoader no longer caches bad results when asked for wrong type (Mike Drob)
+
+* LUCENE-9092: Upgrade Carrot2 to 3.16.2 (Dawid Weiss).
+
+* LUCENE-9080: Upgrade ICU4j to 62.2 and make regenerate work (Erick Erickson)
+
+* SOLR-14271: Remove duplicate async id check meant for pre Solr 8 versions (Anshum Gupta)
+
+* SOLR-14272: Remove autoReplicaFailoverBadNodeExpiration and autoReplicaFailoverWorkLoopDelay for 9.0 as it was
+  deprecated in 7.1 (Anshum Gupta)
+
+* SOLR-14258: DocList no longer extends DocSet. (David Smiley)
+
+* SOLR-14256: Remove HashDocSet; add DocSet.getBits() instead.  DocSet is now strictly immutable and ascending order.
+  It's now locked-down to external extension; only 2 impls exist.  (David Smiley)
+
+* SOLR-14197: SolrResourceLoader: remove deprecated methods and do other improvements. (David Smiley)
+
+* SOLR-14012: Return long value for unique and hll aggregations irrespective of shard count (Munendra S N, hossman)
+
+* SOLR-14322: AbstractFullDistribZkTestBase.waitForRecoveriesToFinish now takes a timeout and time unit instead of
+  assuming that we are passed value in seconds. (Mike Drob)
+
+* SOLR-13893: Remove support to read BlobRepository's max jar size from deprecated `runtme.lib.size` system property
+  (Erick Erickson, Kesharee Nandan Vishwakarma, Munendra S N)
+
+* SOLR-12720: Remove support for `autoReplicaFailoverWaitAfterExpiration`. (marcussorealheis, shalin)
+
+* SOLR-9909: The deprecated SolrjNamedThreadFactory has been removed. Use SolrNamedThreadFactory instead.
+  (Andras Salamon, shalin)
+
+* SOLR-14420: AuthenticationPlugin.authenticate accepts HttpServletRequest instead of ServletRequest. (Mike Drob)
+
+* SOLR-14429: Convert .txt files to properly formatted .md files. (Tomoko Uchida, Uwe Schindler)
+
+* SOLR-14412: Automatically set urlScheme to https when running secure solr with embedded zookeeper. (Mike Drob)
+  Do not erroneously set solr.jetty.https.port system property when running in http mode (Upendra Penegalapati)
+
+* SOLR-14014: Introducing a system property that allows users to disable the Admin UI, which is enabled by default.
+  If you have security concerns or other reasons to disable the Admin UI, you can modify `SOLR_ADMIN_UI_DISABLED`
+  `solr.in.sh`/`solr.in.cmd` at start. (marcussorealheis)
+
+* SOLR-14486: Autoscaling simulation framework no longer creates /clusterstate.json (format 1),
+  instead it creates individual per-collection /state.json files (format 2). (ab)
+
+* SOLR-12823: Remove /clusterstate.json support, including support for collections created with stateFormat=1,
+  as well as support for Collection API MIGRATESTATEFORMAT action and support for the legacyCloud flag (Ilan Ginzburg).
+
+* LUCENE-9411: Fail compilation on warnings, 9x gradle-only (Erick Erickson, Dawid Weiss)
+  Deserves mention here as well as Lucene CHANGES.txt since it affects both.
+
+* SOLR-12847: Remove support for maxShardsPerNode. (ab)
+
+* SOLR-14244: Remove ReplicaInfo. (ab)
+
+* SOLR-14654: Remove plugin loading from .system collection (for 9.0) (noble)
+
+* SOLR-14702: All references to "master" and "slave" replaced with "leader" and "follower" (MarcusSorealheis,
+  Erick Erickson, Tomás Fernández Löbbe)
+
+Bug Fixes
+---------------------
+* SOLR-14546: Fix for a relatively hard to hit issue in OverseerTaskProcessor that could lead to out of order execution
+  of Collection API tasks competing for a lock (Ilan Ginzburg).
+
+==================  8.7.0 ==================
+
+Consult the lucene/CHANGES.txt file for additional, low level, changes in this release.
+
+New Features
+---------------------
+
+* SOLR-14151 Make schema components load from packages (noble)
+
+* SOLR-14615: Implement CPU Utilization Based Circuit Breaker (#1737)
+
 * SOLR-14681: Introduce ability to delete .jar stored in the Package Store. (MarcusSorealheis, Mike Drob)
 
 * SOLR-14604: Add the ability to uninstall a package from with the Package CLI. (MarcusSorealheis)
diff --git a/solr/core/src/java/org/apache/solr/core/SolrConfig.java b/solr/core/src/java/org/apache/solr/core/SolrConfig.java
index ec81688..707d770 100644
--- a/solr/core/src/java/org/apache/solr/core/SolrConfig.java
+++ b/solr/core/src/java/org/apache/solr/core/SolrConfig.java
@@ -228,10 +228,13 @@ public class SolrConfig extends XmlConfigFile implements MapSerializable {
     queryResultMaxDocsCached = getInt("query/queryResultMaxDocsCached", Integer.MAX_VALUE);
     enableLazyFieldLoading = getBool("query/enableLazyFieldLoading", false);
 
-    useCircuitBreakers = getBool("circuitBreaker/useCircuitBreakers", false);
-    memoryCircuitBreakerThresholdPct = getInt("circuitBreaker/memoryCircuitBreakerThresholdPct", 95);
+    useCircuitBreakers = getBool("circuitBreakers/@enabled", false);
+    cpuCBEnabled = getBool("circuitBreakers/cpuBreaker/@enabled", false);
+    memCBEnabled = getBool("circuitBreakers/memBreaker/@enabled", false);
+    memCBThreshold = getInt("circuitBreakers/memBreaker/@threshold", 95);
+    cpuCBThreshold = getInt("circuitBreakers/cpuBreaker/@threshold", 95);
 
-    validateMemoryBreakerThreshold();
+    validateCircuitBreakerThresholds();
     
     filterCacheConfig = CacheConfig.getConfig(this, "query/filterCache");
     queryResultCacheConfig = CacheConfig.getConfig(this, "query/queryResultCache");
@@ -530,7 +533,10 @@ public class SolrConfig extends XmlConfigFile implements MapSerializable {
   public final boolean enableLazyFieldLoading;
   // Circuit Breaker Configuration
   public final boolean useCircuitBreakers;
-  public final int memoryCircuitBreakerThresholdPct;
+  public final int memCBThreshold;
+  public final boolean memCBEnabled;
+  public final boolean cpuCBEnabled;
+  public final int cpuCBThreshold;
 
   // IndexConfig settings
   public final SolrIndexConfig indexConfig;
@@ -811,10 +817,12 @@ public class SolrConfig extends XmlConfigFile implements MapSerializable {
     loader.reloadLuceneSPI();
   }
 
-  private void validateMemoryBreakerThreshold() {
+  private void validateCircuitBreakerThresholds() {
     if (useCircuitBreakers) {
-      if (memoryCircuitBreakerThresholdPct > 95 || memoryCircuitBreakerThresholdPct < 50) {
-        throw new IllegalArgumentException("Valid value range of memoryCircuitBreakerThresholdPct is 50 -  95");
+      if (memCBEnabled) {
+        if (memCBThreshold > 95 || memCBThreshold < 50) {
+          throw new IllegalArgumentException("Valid value range of memoryCircuitBreakerThresholdPct is 50 -  95");
+        }
       }
     }
   }
@@ -889,7 +897,10 @@ public class SolrConfig extends XmlConfigFile implements MapSerializable {
     m.put("enableLazyFieldLoading", enableLazyFieldLoading);
     m.put("maxBooleanClauses", booleanQueryMaxClauseCount);
     m.put("useCircuitBreakers", useCircuitBreakers);
-    m.put("memoryCircuitBreakerThresholdPct", memoryCircuitBreakerThresholdPct);
+    m.put("cpuCircuitBreakerEnabled", cpuCBEnabled);
+    m.put("memoryCircuitBreakerEnabled", memCBEnabled);
+    m.put("memoryCircuitBreakerThresholdPct", memCBThreshold);
+    m.put("cpuCircuitBreakerThreshold", cpuCBThreshold);
     for (SolrPluginInfo plugin : plugins) {
       List<PluginInfo> infos = getPluginInfos(plugin.clazz.getName());
       if (infos == null || infos.isEmpty()) continue;
diff --git a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
new file mode 100644
index 0000000..45dc1e8
--- /dev/null
+++ b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.management.ManagementFactory;
+import java.lang.management.OperatingSystemMXBean;
+
+import org.apache.solr.core.SolrConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * <p>
+ * Tracks current CPU usage and triggers if the specified threshold is breached.
+ *
+ * This circuit breaker gets the average CPU load over the last minute and uses
+ * that data to take a decision. We depend on OperatingSystemMXBean which does
+ * not allow a configurable interval of collection of data.
+ * //TODO: Use Codahale Meter to calculate the value locally.
+ * </p>
+ *
+ * <p>
+ * The configuration to define which mode to use and the trigger threshold are defined in
+ * solrconfig.xml
+ * </p>
+ */
+public class CPUCircuitBreaker extends CircuitBreaker {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+  private static final OperatingSystemMXBean operatingSystemMXBean = ManagementFactory.getOperatingSystemMXBean();
+
+  private final boolean enabled;
+  private final double cpuUsageThreshold;
+
+  // Assumption -- the value of these parameters will be set correctly before invoking getDebugInfo()
+  private static final ThreadLocal<Double> seenCPUUsage = ThreadLocal.withInitial(() -> 0.0);
+
+  private static final ThreadLocal<Double> allowedCPUUsage = ThreadLocal.withInitial(() -> 0.0);
+
+  public CPUCircuitBreaker(SolrConfig solrConfig) {
+    super(solrConfig);
+
+    this.enabled = solrConfig.cpuCBEnabled;
+    this.cpuUsageThreshold = solrConfig.cpuCBThreshold;
+  }
+
+  @Override
+  public boolean isTripped() {
+    if (!isEnabled()) {
+      return false;
+    }
+
+    if (!enabled) {
+      return false;
+    }
+
+    double localAllowedCPUUsage = getCpuUsageThreshold();
+    double localSeenCPUUsage = calculateLiveCPUUsage();
+
+    if (localSeenCPUUsage < 0) {
+      if (log.isWarnEnabled()) {
+        String msg = "Unable to get CPU usage";
+
+        log.warn(msg);
+      }
+
+      return false;
+    }
+
+    allowedCPUUsage.set(localAllowedCPUUsage);
+
+    seenCPUUsage.set(localSeenCPUUsage);
+
+    return (localSeenCPUUsage >= localAllowedCPUUsage);
+  }
+
+  @Override
+  public String getDebugInfo() {
+
+    if (seenCPUUsage.get() == 0.0 || allowedCPUUsage.get() == 0.0) {
+      log.warn("CPUCircuitBreaker's monitored values (seenCPUUSage, allowedCPUUsage) not set");
+    }
+
+    return "seenCPUUSage=" + seenCPUUsage.get() + " allowedCPUUsage=" + allowedCPUUsage.get();
+  }
+
+  @Override
+  public String getErrorMessage() {
+    return "CPU Circuit Breaker triggered as seen CPU usage is above allowed threshold. " +
+        "Seen CPU usage " + seenCPUUsage.get() + " and allocated threshold " +
+        allowedCPUUsage.get();
+  }
+
+  public double getCpuUsageThreshold() {
+    return cpuUsageThreshold;
+  }
+
+  protected double calculateLiveCPUUsage() {
+    return operatingSystemMXBean.getSystemLoadAverage();
+  }
+}
diff --git a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java
index f56f81e..63ad7fd 100644
--- a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java
+++ b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java
@@ -27,6 +27,9 @@ import org.apache.solr.core.SolrConfig;
  *  2. Use the circuit breaker in a specific code path(s).
  *
  * TODO: This class should be grown as the scope of circuit breakers grow.
+ *
+ * The class and its derivatives raise a standard exception when a circuit breaker is triggered.
+ * We should make it into a dedicated exception (https://issues.apache.org/jira/browse/SOLR-14755)
  * </p>
  */
 public abstract class CircuitBreaker {
@@ -53,4 +56,9 @@ public abstract class CircuitBreaker {
    * Get debug useful info.
    */
   public abstract String getDebugInfo();
+
+  /**
+   * Get error message when the circuit breaker triggers
+   */
+  public abstract String getErrorMessage();
 }
\ No newline at end of file
diff --git a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
index 584b933..ed7f62d 100644
--- a/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
+++ b/solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
@@ -20,6 +20,7 @@ package org.apache.solr.util.circuitbreaker;
 import java.util.ArrayList;
 import java.util.List;
 
+import com.google.common.annotations.VisibleForTesting;
 import org.apache.solr.core.SolrConfig;
 
 /**
@@ -107,9 +108,7 @@ public class CircuitBreakerManager {
     StringBuilder sb = new StringBuilder();
 
     for (CircuitBreaker circuitBreaker : circuitBreakerList) {
-      sb.append(circuitBreaker.getClass().getName());
-      sb.append(" ");
-      sb.append(circuitBreaker.getDebugInfo());
+      sb.append(circuitBreaker.getErrorMessage());
       sb.append("\n");
     }
 
@@ -127,8 +126,16 @@ public class CircuitBreakerManager {
 
     // Install the default circuit breakers
     CircuitBreaker memoryCircuitBreaker = new MemoryCircuitBreaker(solrConfig);
+    CircuitBreaker cpuCircuitBreaker = new CPUCircuitBreaker(solrConfig);
+
     circuitBreakerManager.register(memoryCircuitBreaker);
+    circuitBreakerManager.register(cpuCircuitBreaker);
 
     return circuitBreakerManager;
   }
+
+  @VisibleForTesting
+  public List<CircuitBreaker> getRegisteredCircuitBreakers() {
+    return circuitBreakerList;
+  }
 }
diff --git a/solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java b/solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java
index 629d84a..797677d 100644
--- a/solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java
+++ b/solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java
@@ -43,22 +43,25 @@ public class MemoryCircuitBreaker extends CircuitBreaker {
   private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
   private static final MemoryMXBean MEMORY_MX_BEAN = ManagementFactory.getMemoryMXBean();
 
+  private boolean enabled;
   private final long heapMemoryThreshold;
 
   // Assumption -- the value of these parameters will be set correctly before invoking getDebugInfo()
-  private final ThreadLocal<Long> seenMemory = new ThreadLocal<>();
-  private final ThreadLocal<Long> allowedMemory = new ThreadLocal<>();
+  private static final ThreadLocal<Long> seenMemory = ThreadLocal.withInitial(() -> 0L);
+  private static final ThreadLocal<Long> allowedMemory = ThreadLocal.withInitial(() -> 0L);
 
   public MemoryCircuitBreaker(SolrConfig solrConfig) {
     super(solrConfig);
 
+    this.enabled = solrConfig.memCBEnabled;
+
     long currentMaxHeap = MEMORY_MX_BEAN.getHeapMemoryUsage().getMax();
 
     if (currentMaxHeap <= 0) {
       throw new IllegalArgumentException("Invalid JVM state for the max heap usage");
     }
 
-    int thresholdValueInPercentage = solrConfig.memoryCircuitBreakerThresholdPct;
+    int thresholdValueInPercentage = solrConfig.memCBThreshold;
     double thresholdInFraction = thresholdValueInPercentage / (double) 100;
     heapMemoryThreshold = (long) (currentMaxHeap * thresholdInFraction);
 
@@ -76,6 +79,10 @@ public class MemoryCircuitBreaker extends CircuitBreaker {
       return false;
     }
 
+    if (!enabled) {
+      return false;
+    }
+
     long localAllowedMemory = getCurrentMemoryThreshold();
     long localSeenMemory = calculateLiveMemoryUsage();
 
@@ -95,6 +102,13 @@ public class MemoryCircuitBreaker extends CircuitBreaker {
     return "seenMemory=" + seenMemory.get() + " allowedMemory=" + allowedMemory.get();
   }
 
+  @Override
+  public String getErrorMessage() {
+    return "Memory Circuit Breaker triggered as JVM heap usage values are greater than allocated threshold. " +
+        "Seen JVM heap memory usage " + seenMemory.get() + " and allocated threshold " +
+        allowedMemory.get();
+  }
+
   private long getCurrentMemoryThreshold() {
     return heapMemoryThreshold;
   }
diff --git a/solr/core/src/resources/EditableSolrConfigAttributes.json b/solr/core/src/resources/EditableSolrConfigAttributes.json
index 0ed8333..1f49982 100644
--- a/solr/core/src/resources/EditableSolrConfigAttributes.json
+++ b/solr/core/src/resources/EditableSolrConfigAttributes.json
@@ -55,8 +55,17 @@
     "queryResultMaxDocsCached":1,
     "enableLazyFieldLoading":1,
     "boolTofilterOptimizer":1,
-    "useCircuitBreakers":10,
-    "memoryCircuitBreakerThresholdPct":20,
+    "circuitBreakers":{
+      "enabled":10,
+      "memBreaker":{
+        "enabled":10,
+        "threshold":20
+      },
+      "cpuBreaker":{
+        "enabled":10,
+        "threshold":20
+      }
+    },
     "maxBooleanClauses":1},
   "jmx":{
     "agentId":0,
diff --git a/solr/core/src/test-files/solr/collection1/conf/solrconfig-memory-circuitbreaker.xml b/solr/core/src/test-files/solr/collection1/conf/solrconfig-memory-circuitbreaker.xml
index b6b20ff..d6adf92 100644
--- a/solr/core/src/test-files/solr/collection1/conf/solrconfig-memory-circuitbreaker.xml
+++ b/solr/core/src/test-files/solr/collection1/conf/solrconfig-memory-circuitbreaker.xml
@@ -78,13 +78,10 @@
 
   </query>
 
-  <circuitBreaker>
-
-    <useCircuitBreakers>true</useCircuitBreakers>
-
-    <memoryCircuitBreakerThresholdPct>75</memoryCircuitBreakerThresholdPct>
-
-  </circuitBreaker>
+  <circuitBreakers enabled="true">
+    <cpuBreaker enabled="true" threshold="75"/>
+    <memBreaker enabled="true" threshold="75"/>
+  </circuitBreakers>
 
   <initParams path="/select">
     <lst name="defaults">
diff --git a/solr/core/src/test/org/apache/solr/core/SolrCoreTest.java b/solr/core/src/test/org/apache/solr/core/SolrCoreTest.java
index dcf593f..342fa3e 100644
--- a/solr/core/src/test/org/apache/solr/core/SolrCoreTest.java
+++ b/solr/core/src/test/org/apache/solr/core/SolrCoreTest.java
@@ -266,7 +266,8 @@ public class SolrCoreTest extends SolrTestCaseJ4 {
     assertEquals("wrong config for enableLazyFieldLoading", true, solrConfig.enableLazyFieldLoading);
     assertEquals("wrong config for queryResultWindowSize", 10, solrConfig.queryResultWindowSize);
     assertEquals("wrong config for useCircuitBreakers", false, solrConfig.useCircuitBreakers);
-    assertEquals("wrong config for memoryCircuitBreakerThresholdPct", 95, solrConfig.memoryCircuitBreakerThresholdPct);
+    assertEquals("wrong config for memoryCircuitBreakerThresholdPct", 95, solrConfig.memCBThreshold);
+    assertEquals("wrong config for cpuCircuitBreakerThreshold", 95, solrConfig.cpuCBThreshold);
   }
 
   /**
diff --git a/solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java b/solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java
index 00b8d1a..83e8fc5 100644
--- a/solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java
+++ b/solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java
@@ -17,8 +17,11 @@
 
 package org.apache.solr.util;
 
+import java.util.ArrayList;
 import java.util.HashMap;
+import java.util.List;
 import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
 
@@ -30,6 +33,7 @@ import org.apache.solr.common.util.ExecutorUtil;
 import org.apache.solr.common.util.SolrNamedThreadFactory;
 import org.apache.solr.core.SolrConfig;
 import org.apache.solr.search.QueryParsing;
+import org.apache.solr.util.circuitbreaker.CPUCircuitBreaker;
 import org.apache.solr.util.circuitbreaker.CircuitBreaker;
 import org.apache.solr.util.circuitbreaker.MemoryCircuitBreaker;
 import org.junit.After;
@@ -38,6 +42,9 @@ import org.junit.Rule;
 import org.junit.rules.RuleChain;
 import org.junit.rules.TestRule;
 
+import static org.hamcrest.CoreMatchers.containsString;
+
+@SuppressWarnings({"rawtypes"})
 public class TestCircuitBreaker extends SolrTestCaseJ4 {
   private final static int NUM_DOCS = 20;
 
@@ -80,6 +87,8 @@ public class TestCircuitBreaker extends SolrTestCaseJ4 {
     args.put(QueryParsing.DEFTYPE, CircuitBreaker.NAME);
     args.put(CommonParams.FL, "id");
 
+    removeAllExistingCircuitBreakers();
+
     CircuitBreaker circuitBreaker = new MockCircuitBreaker(h.getCore().getSolrConfig());
 
     h.getCore().getCircuitBreakerManager().register(circuitBreaker);
@@ -95,6 +104,8 @@ public class TestCircuitBreaker extends SolrTestCaseJ4 {
     args.put(QueryParsing.DEFTYPE, CircuitBreaker.NAME);
     args.put(CommonParams.FL, "id");
 
+    removeAllExistingCircuitBreakers();
+
     CircuitBreaker circuitBreaker = new FakeMemoryPressureCircuitBreaker(h.getCore().getSolrConfig());
 
     h.getCore().getCircuitBreakerManager().register(circuitBreaker);
@@ -115,26 +126,43 @@ public class TestCircuitBreaker extends SolrTestCaseJ4 {
     AtomicInteger failureCount = new AtomicInteger();
 
     try {
+      removeAllExistingCircuitBreakers();
+
       CircuitBreaker circuitBreaker = new BuildingUpMemoryPressureCircuitBreaker(h.getCore().getSolrConfig());
 
       h.getCore().getCircuitBreakerManager().register(circuitBreaker);
 
+      List<Future<?>> futures = new ArrayList<>();
+
       for (int i = 0; i < 5; i++) {
-        executor.submit(() -> {
+        Future<?> future = executor.submit(() -> {
           try {
             h.query(req("name:\"john smith\""));
           } catch (SolrException e) {
+
+            assertThat(e.getMessage(), containsString("Circuit Breakers tripped"));
             failureCount.incrementAndGet();
           } catch (Exception e) {
             throw new RuntimeException(e.getMessage());
           }
         });
+
+        futures.add(future);
+      }
+
+      for  (Future<?> future : futures) {
+        try {
+          future.get();
+        } catch (Exception e) {
+          throw new RuntimeException(e.getMessage());
+        }
       }
 
       executor.shutdown();
       try {
         executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
       } catch (InterruptedException e) {
+        Thread.currentThread().interrupt();
         throw new RuntimeException(e.getMessage());
       }
 
@@ -146,6 +174,59 @@ public class TestCircuitBreaker extends SolrTestCaseJ4 {
     }
   }
 
+  public void testFakeCPUCircuitBreaker() {
+    AtomicInteger failureCount = new AtomicInteger();
+
+    ExecutorService executor = ExecutorUtil.newMDCAwareCachedThreadPool(
+        new SolrNamedThreadFactory("TestCircuitBreaker"));
+    try {
+      removeAllExistingCircuitBreakers();
+
+      CircuitBreaker circuitBreaker = new FakeCPUCircuitBreaker(h.getCore().getSolrConfig());
+
+      h.getCore().getCircuitBreakerManager().register(circuitBreaker);
+
+      List<Future<?>> futures = new ArrayList<>();
+
+      for (int i = 0; i < 5; i++) {
+        Future<?> future = executor.submit(() -> {
+          try {
+            h.query(req("name:\"john smith\""));
+          } catch (SolrException e) {
+            assertThat(e.getMessage(), containsString("Circuit Breakers tripped"));
+            failureCount.incrementAndGet();
+          } catch (Exception e) {
+            throw new RuntimeException(e.getMessage());
+          }
+        });
+
+        futures.add(future);
+      }
+
+      for  (Future<?> future : futures) {
+        try {
+          future.get();
+        } catch (Exception e) {
+          throw new RuntimeException(e.getMessage());
+        }
+      }
+
+      executor.shutdown();
+      try {
+        executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
+      } catch (InterruptedException e) {
+        Thread.currentThread().interrupt();
+        throw new RuntimeException(e.getMessage());
+      }
+
+      assertEquals("Number of failed queries is not correct",5, failureCount.get());
+    } finally {
+      if (!executor.isShutdown()) {
+        executor.shutdown();
+      }
+    }
+  }
+
   public void testResponseWithCBTiming() {
     assertQ(req("q", "*:*", CommonParams.DEBUG_QUERY, "true"),
         "//str[@name='rawquerystring']='*:*'",
@@ -168,7 +249,13 @@ public class TestCircuitBreaker extends SolrTestCaseJ4 {
     );
   }
 
-  private class MockCircuitBreaker extends CircuitBreaker {
+  private void removeAllExistingCircuitBreakers() {
+    List<CircuitBreaker> registeredCircuitBreakers = h.getCore().getCircuitBreakerManager().getRegisteredCircuitBreakers();
+
+    registeredCircuitBreakers.clear();
+  }
+
+  private static class MockCircuitBreaker extends MemoryCircuitBreaker {
 
     public MockCircuitBreaker(SolrConfig solrConfig) {
       super(solrConfig);
@@ -186,7 +273,7 @@ public class TestCircuitBreaker extends SolrTestCaseJ4 {
     }
   }
 
-  private class FakeMemoryPressureCircuitBreaker extends MemoryCircuitBreaker {
+  private static class FakeMemoryPressureCircuitBreaker extends MemoryCircuitBreaker {
 
     public FakeMemoryPressureCircuitBreaker(SolrConfig solrConfig) {
       super(solrConfig);
@@ -215,4 +302,15 @@ public class TestCircuitBreaker extends SolrTestCaseJ4 {
       return 5; // Random number guaranteed to not trip the circuit breaker
     }
   }
+
+  private static class FakeCPUCircuitBreaker extends CPUCircuitBreaker {
+    public FakeCPUCircuitBreaker(SolrConfig solrConfig) {
+      super(solrConfig);
+    }
+
+    @Override
+    protected double calculateLiveCPUUsage() {
+      return 92; // Return a value large enough to trigger the circuit breaker
+    }
+  }
 }
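
Note on the test changes above: the tests now keep each Future returned by
executor.submit() and call get() on it, so that a failure raised inside a worker
thread (for example from assertThat) reaches the test thread instead of being
lost. A minimal standalone sketch of that pattern follows. The class name
FutureDrainSketch and the method runAndCountFailures are hypothetical, and
IllegalStateException merely stands in for the SolrException thrown when a
breaker trips.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical standalone class; stands in for the Solr test harness.
class FutureDrainSketch {
  // Runs `tasks` copies of `work` and waits on every Future, so anything a
  // worker throws (other than the expected exception) surfaces in the caller.
  static int runAndCountFailures(int tasks, Runnable work) {
    AtomicInteger failures = new AtomicInteger();
    ExecutorService executor = Executors.newCachedThreadPool();
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (int i = 0; i < tasks; i++) {
        futures.add(executor.submit(() -> {
          try {
            work.run();
          } catch (IllegalStateException e) {
            // Expected rejection: count it, as the tests count SolrException.
            failures.incrementAndGet();
          }
        }));
      }
      for (Future<?> future : futures) {
        try {
          future.get(); // rethrows worker failures as ExecutionException
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new RuntimeException(e);
        } catch (ExecutionException e) {
          throw new RuntimeException(e.getCause());
        }
      }
    } finally {
      executor.shutdown();
    }
    return failures.get();
  }

  public static void main(String[] args) {
    int failed = runAndCountFailures(5, () -> {
      throw new IllegalStateException("Circuit Breakers tripped");
    });
    System.out.println("failed=" + failed);
  }
}
```

Without the get() calls, an AssertionError thrown in a worker would be stored in
its Future and never observed, so the test could pass while an assertion failed.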
diff --git a/solr/server/solr/configsets/_default/conf/solrconfig.xml b/solr/server/solr/configsets/_default/conf/solrconfig.xml
index cdd56cf..85c5671 100644
--- a/solr/server/solr/configsets/_default/conf/solrconfig.xml
+++ b/solr/server/solr/configsets/_default/conf/solrconfig.xml
@@ -598,27 +598,24 @@
      Circuit Breaker Section - This section consists of configurations for
      circuit breakers
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
-  <circuitBreaker>
-    <!-- Enable Circuit Breakers
+
+    <!-- Circuit Breakers
 
      Circuit breakers are designed to allow stability and predictable query
      execution. They prevent operations that can take down the node and cause
      noisy neighbour issues.
 
     This flag is the uber control switch which controls the activation/deactivation of all circuit
-     breakers. At the moment, the only circuit breaker (max JVM circuit breaker) does not have its
-     own specific configuration. However, if a circuit breaker wishes to be independently configurable,
+     breakers. If a circuit breaker wishes to be independently configurable,
     they are free to add their specific configuration but need to ensure that this flag is always
     respected - this should have veto over all independent configuration flags.
     -->
-    <!--
-   <useCircuitBreakers>true</useCircuitBreakers>
-    -->
+    <circuitBreakers enabled="true">
 
-    <!-- Memory Circuit Breaker Threshold In Percentage
+    <!-- Memory Circuit Breaker Configuration
 
-     Specific configuration for max JVM heap usage circuit breaker. This configuration defines the
-     threshold percentage of maximum heap allocated beyond which queries will be rejected until the
+     Specific configuration for max JVM heap usage circuit breaker. This configuration defines whether
+     the circuit breaker is enabled and the threshold percentage of maximum heap allocated beyond which queries will be rejected until the
     current JVM usage goes below the threshold. The valid value range for this value is 50-95.
 
     Consider a scenario where the max heap allocated is 4 GB and memoryCircuitBreakerThreshold is
@@ -629,12 +626,31 @@
     If you see queries getting rejected with 503 error code, check for "Circuit Breakers tripped"
     in logs and the corresponding error message should tell you what transpired (if the failure
      was caused by tripped circuit breakers).
+
+     If, at any point, the current JVM heap usage goes above 3 GB, queries will be rejected until the heap usage goes below 3 GB again.
+     If you see queries getting rejected with 503 error code, check for "Circuit Breakers tripped"
+     in logs and the corresponding error message should tell you what transpired (if the failure
+     was caused by tripped circuit breakers).
     -->
     <!--
-   <memoryCircuitBreakerThresholdPct>100</memoryCircuitBreakerThresholdPct>
+   <memBreaker enabled="true" threshold="75"/>
     -->
 
-  </circuitBreaker>
+      <!-- CPU Circuit Breaker Configuration
+
+     Specific configuration for CPU utilization based circuit breaker. This configuration defines whether the circuit breaker is enabled
+     and the average load over the last minute at which the circuit breaker should start rejecting queries.
+
+     Consider a scenario where the max heap allocated is 4 GB and memoryCircuitBreakerThreshold is
+     defined as 75. Threshold JVM usage will be 4 * 0.75 = 3 GB. Its generally a good idea to keep this value between 75 - 80% of maximum heap
+     allocated.
+    -->
+
+      <!--
+       <cpuBreaker enabled="true" threshold="75"/>
+      -->
+
+  </circuitBreakers>
 
 
   <!-- Request Dispatcher
diff --git a/solr/solr-ref-guide/src/circuit-breakers.adoc b/solr/solr-ref-guide/src/circuit-breakers.adoc
index 1629c32..56451b0 100644
--- a/solr/solr-ref-guide/src/circuit-breakers.adoc
+++ b/solr/solr-ref-guide/src/circuit-breakers.adoc
@@ -32,9 +32,14 @@ will be disabled globally. Per circuit breaker configurations are specified in t
 
 [source,xml]
 ----
-<useCircuitBreakers>false</useCircuitBreakers>
+<circuitBreakers enabled="true">
+  <!-- All specific configs in this section -->
+</circuitBreakers>
 ----
 
+This flag acts as the highest authority and global controller of circuit breakers. For using specific circuit breakers, each one
+needs to be individually enabled in addition to this flag being enabled.
+
 == Currently Supported Circuit Breakers
 
 === JVM Heap Usage Based Circuit Breaker
@@ -42,26 +47,43 @@ This circuit breaker tracks JVM heap memory usage and rejects incoming search re
 exceeds a configured percentage of maximum heap allocated to the JVM (-Xmx). The main configuration for this circuit breaker is
 controlling the threshold percentage at which the breaker will trip.
 
-It does not logically make sense to have a threshold below 50% and above 95% of the max heap allocated to the JVM. Hence, the range
-of valid values for this parameter is [50, 95], both inclusive.
+Configuration for JVM heap usage based circuit breaker:
 
 [source,xml]
 ----
-<memoryCircuitBreakerThresholdPct>75</memoryCircuitBreakerThresholdPct>
+<memBreaker enabled="true" threshold="75"/>
 ----
 
+Note that this configuration will be overridden by the global circuit breaker flag -- if circuit breakers are disabled, this flag
+will not help you. Also, the triggering threshold is defined as a percentage of the max heap allocated to the JVM.
+
+It does not logically make sense to have a threshold below 50% and above 95% of the max heap allocated to the JVM. Hence, the range
+of valid values for this parameter is [50, 95], both inclusive.
+
 Consider the following example:
 
 JVM has been allocated a maximum heap of 5GB (-Xmx) and memoryCircuitBreakerThresholdPct is set to 75. In this scenario, the heap usage
 at which the circuit breaker will trip is 3.75GB.
 
-Note that this circuit breaker is checked for each incoming search request and considers the current heap usage of the node, i.e every search
-request will get the live heap usage and compare it against the set memory threshold. The check does not impact performance,
-but any performance regressions that are suspected to be caused by this feature should be reported to the dev list.
 
+=== CPU Utilization Based Circuit Breaker
+This circuit breaker tracks CPU utilization and triggers if the average CPU utilization over the last one minute
+exceeds a configurable threshold. Note that the value used in computation is over the last one minute -- so a sudden
+spike in traffic that goes down might still cause the circuit breaker to trigger for a short while before it resolves
+and updates the value. For more details of the calculation, please see https://en.wikipedia.org/wiki/Load_(computing)
+
+Configuration for CPU utilization based circuit breaker:
+
+[source,xml]
+----
+<cpuBreaker enabled="true" threshold="20"/>
+----
+
+Note that this configuration will be overridden by the global circuit breaker flag -- if circuit breakers are disabled, this flag
+will not help you. The triggering threshold is defined in units of CPU utilization.
 
 == Performance Considerations
-It is worth noting that while JVM circuit breaker does not add any noticeable overhead per query, having too many
+It is worth noting that while JVM or CPU circuit breakers do not add any noticeable overhead per query, having too many
 circuit breakers checked for a single request can cause a performance overhead.
 
 In addition, it is a good practice to exponentially back off while retrying requests on a busy node.
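
The exponential back-off advice in the last hunk can be sketched on the client side as follows. This is a minimal illustration and is not part of the commit: the class name BackoffSketch, the method delayMillis, and the 100 ms base and 5000 ms cap are assumptions chosen for the example, and the code does not use any Solr or SolrJ API.

```java
public class BackoffSketch {

    /**
     * Hypothetical helper: delay before retry number `attempt` (0-based).
     * The delay doubles with every attempt, starting at baseMillis,
     * and never exceeds maxMillis.
     */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 0 || baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("invalid back-off parameters");
        }
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Doubling would exceed the cap (or overflow), so clamp to the cap.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }

    public static void main(String[] args) {
        // Assumed values: 100 ms base delay, 5 s cap, 8 attempts.
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt + " -> wait "
                + delayMillis(attempt, 100, 5000) + " ms");
        }
    }
}
```

A client that receives a 503 from a node with tripped circuit breakers could sleep for delayMillis(attempt, ...) before retrying. Adding random jitter to each delay is a common refinement so that many clients do not retry in lockstep.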

