From dev-return-63083-archive-asf-public=cust-asf.ponee.io@activemq.apache.org Fri Jan 5 23:45:31 2018
Return-Path:
X-Original-To: archive-asf-public@eu.ponee.io
Delivered-To: archive-asf-public@eu.ponee.io
Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by mx-eu-01.ponee.io (Postfix) with ESMTP id AB82F180647 for ; Fri, 5 Jan 2018 23:45:31 +0100 (CET)
Received: by cust-asf.ponee.io (Postfix) id 9B39F160C27; Fri, 5 Jan 2018 22:45:31 +0000 (UTC)
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id E1A02160C15 for ; Fri, 5 Jan 2018 23:45:30 +0100 (CET)
Received: (qmail 50344 invoked by uid 500); 5 Jan 2018 22:45:29 -0000
Mailing-List: contact dev-help@activemq.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@activemq.apache.org
Delivered-To: mailing list dev@activemq.apache.org
Received: (qmail 50333 invoked by uid 99); 5 Jan 2018 22:45:29 -0000
Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 05 Jan 2018 22:45:29 +0000
Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id 94EEDDFCDE; Fri, 5 Jan 2018 22:45:29 +0000 (UTC)
From: superawesome
To: dev@activemq.apache.org
Reply-To: dev@activemq.apache.org
Message-ID:
Subject: [GitHub] activemq pull request #272: replicated LevelDB fix and debugging output
Content-Type: text/plain
Date: Fri, 5 Jan 2018 22:45:29 +0000 (UTC)

GitHub user superawesome opened a pull request:

    https://github.com/apache/activemq/pull/272

    replicated LevelDB fix and debugging output

I know LevelDB is now deprecated, and this may not get merged because of that. I certainly don't want to become its maintainer. I had an unhealthy cluster, and did not want to try and migrate it while in that state.
This is just a small change to get it healthy again after a slave encountered this error:

    Unexpected session error: java.net.ProtocolException: Maximum protocol buffer length exeeded

In my case I believe this is due to having too many LevelDB log files to replicate (a separate LevelDB bug, which I intend to investigate now that this cluster is healthy again).

2 commits in this PR:
1) debugging output to track that down and make it more ... debuggable.
2) a 2-character change to increase a buffer size by 4x, to "fix" the problem.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/superawesome/activemq master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq/pull/272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #272

----
commit 4549f8ebbad4fb9191ef99da48d053864b31916c
Author: Jake Maul
Date:   2018-01-05T22:27:44Z

    adding stack trace output to debug log level

    Tracing a connection failure was proving problematic. The session error is a warning, but the particular error *I* was having is actually rather fatal, as restarting the slave connections does not fix anything- it just fails again (and again, and again...).

    This will land in the log file iff debug logging is enabled. Otherwise this is a no-op.

commit ce29c2f29c3d9f4ae5d930e57e573f69f09521f4
Author: Jake Maul
Date:   2018-01-05T22:35:10Z

    larger buffer for reading replication frames

    The default size (1024*64=65536 bytes) was insufficient on a cluster I manage. I think because of an unrelated bug where LevelDB log files don't get purged. This is a simple fix to get it going again.

    Making it 4x as big was *not* scientifically determined. I tried making it 4x as big first, it worked, and I haven't tried any other size. I don't know why it was 64k in the first place, so I can't be sure this doesn't cause any side effects.

----

---
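
For readers unfamiliar with this failure mode, here is a minimal sketch of how a capped frame reader might reject an oversized replication frame, and why raising the cap from 1024*64 to 4x that size could let the same frame through. It is not the code from this pull request or from ActiveMQ: the newline delimiter and the names `read_frame`, `fits`, and `ProtocolError` are assumptions made purely for illustration.

```python
import io

DEFAULT_MAX_FRAME = 1024 * 64               # 65536 bytes, the default cited in the commit message
ENLARGED_MAX_FRAME = DEFAULT_MAX_FRAME * 4  # 262144 bytes, the 4x value described in the PR


class ProtocolError(Exception):
    """Raised when a frame grows past the configured maximum length."""


def read_frame(stream, max_length, delimiter=b"\n"):
    """Read bytes up to `delimiter`, refusing frames longer than `max_length`.

    Hypothetical helper: the delimiter-based framing is an assumption,
    not a description of the real replication codec.
    """
    frame = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended before the frame delimiter")
        if byte == delimiter:
            return bytes(frame)
        frame += byte
        if len(frame) > max_length:
            # Mirrors the kind of error quoted in the PR description.
            raise ProtocolError("Maximum protocol buffer length exceeded")


def fits(payload, max_length):
    """Return True if `payload` can be read as one frame under `max_length`."""
    try:
        read_frame(io.BytesIO(payload + b"\n"), max_length)
        return True
    except ProtocolError:
        return False


if __name__ == "__main__":
    big_frame = b"x" * 100_000  # made-up payload size, larger than 64 KiB
    print(fits(big_frame, DEFAULT_MAX_FRAME))
    print(fits(big_frame, ENLARGED_MAX_FRAME))
```

In this sketch a 100,000-byte frame is rejected under the 64 KiB cap and accepted under the 4x cap. As the commit message notes, a larger cap only moves the threshold; a frame that keeps growing (for example, because log files are never purged) would eventually exceed it again.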