Date: Thu, 4 May 2017 01:57:47 -0700 (MST)
From: tysli2016
To: user@ignite.apache.org
Message-ID: <1493888267336-12409.post@n6.nabble.com>
Subject: OOME on 2-node cluster with visor running repeatedly, Ignite 1.9

I got "OutOfMemoryError: Java heap space" on a 2-node cluster while `visor` was being run repeatedly.

The server nodes run on CentOS 7 inside Oracle VirtualBox VMs, each with the same configuration:
- 2 vCPUs
- 3.5GB memory
- Oracle JDK 1.8.0_121

`default-config.xml` was modified to use a non-default multicast group and 1 backup:

`visor` was run repeatedly on one of the nodes by a shell script:

    #!/bin/bash
    IGNITE_HOME=/root/apache-ignite-fabric-1.9.0-bin
    while true
    do
      ${IGNITE_HOME}/bin/ignitevisorcmd.sh -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"
    done

The OOME was thrown after this setup had been running for 1 day.

I have put the Ignite log, GC log, and heap dump in `dee657c8.tgz`, which can be downloaded from:
https://drive.google.com/drive/folders/0BwY2dxDlRYhBSFJhS0ZWOVBiNk0?usp=sharing
`507f0201.tgz` contains the Ignite log and GC log from the other node in the cluster, for reference.

Running `visor` repeatedly is only meant to reproduce the OOME more quickly; in production we run `visor` once every 10 minutes to monitor the health of the cluster.

Questions:
1. Is anything wrong with the configuration? Can anything be tuned to avoid the OOME?
2. Are there any other built-in tools for monitoring the cluster? Showing the number of server nodes would be good enough.
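
For comparison with the reproduction loop, the production cadence (one `visor` run every 10 minutes) can be sketched roughly as below. This is only an illustration, not the actual production script: the names `VISOR_CMD`, `VISOR_INTERVAL_SECS`, `run_visor_once` and `monitor_loop` are made up for this sketch, and only the `ignitevisorcmd.sh` invocation itself is the one used above.

```bash
#!/bin/bash
# Sketch only: same visor invocation as the reproduction loop, but paced.
# All variable and function names here are illustrative assumptions.
IGNITE_HOME="${IGNITE_HOME:-/root/apache-ignite-fabric-1.9.0-bin}"
VISOR_CMD="${VISOR_CMD:-${IGNITE_HOME}/bin/ignitevisorcmd.sh}"
VISOR_INTERVAL_SECS="${VISOR_INTERVAL_SECS:-600}"   # 10 minutes

# Open the cluster config, run the `node` command once, then exit.
run_visor_once() {
    "${VISOR_CMD}" -e="'open -cpath=${IGNITE_HOME}/config/default-config.xml;node'"
}

# Run visor, then sleep for the interval between runs.
# $1: number of runs to perform; 0 (the default) means run forever.
monitor_loop() {
    local max_runs="${1:-0}"
    local runs=0
    while true
    do
        run_visor_once
        runs=$((runs + 1))
        if [ "${max_runs}" -gt 0 ] && [ "${runs}" -ge "${max_runs}" ]; then
            break
        fi
        sleep "${VISOR_INTERVAL_SECS}"
    done
}

# To start monitoring, call: monitor_loop
```

With the default interval this starts a new `visor` JVM 144 times a day, whereas the reproduction loop starts the next one as soon as the previous one exits.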
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/OOME-on-2-node-cluster-with-visor-running-repeatedly-Ignite-1-9-tp12409.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.