From: Ilya Kasnacheev
Date: Mon, 15 Apr 2019 12:05:16 +0300
Subject: Re: Ignite DataStreamer Memory Problems
To: user@ignite.apache.org

Hello!

DataStreamer.close() WILL block until all data is loaded into the caches.

The recommendation here is probably to reduce perNodeParallelOperations(), streamerBufferSize() and perThreadBufferSize(), and to flush() your DataStreamer frequently to avoid data build-up in the DataStreamer's temporary data structures. Or, if you have a few entries which are very large, you can just use the Cache API to populate those.
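
The flush-often advice can be sketched roughly as below. This is only an illustration, not Ignite code: `Streamer` and `DirectCache` are hypothetical stand-ins so the example runs without a cluster. In a real application they would correspond to `IgniteDataStreamer.addData(K, V)` / `IgniteDataStreamer.flush()` and `IgniteCache.put(K, V)`. The batch size of 4 and the 1024-byte "large entry" threshold are arbitrary illustrative values. On a real streamer the buffers would additionally be lowered with `perNodeParallelOperations(int)`, `perNodeBufferSize(int)` and `perThreadBufferSize(int)`; treating `perNodeBufferSize` as the setting meant by `streamerBufferSize()` is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FlushingLoader {
    /** Hypothetical stand-in for the two IgniteDataStreamer calls used here. */
    public interface Streamer {
        void addData(String key, byte[] val);
        void flush();
    }

    /** Hypothetical stand-in for a direct cache write such as IgniteCache.put. */
    public interface DirectCache {
        void put(String key, byte[] val);
    }

    /**
     * Streams entries and flushes after every flushEvery streamed entries.
     * Entries of at least largeEntryBytes bypass the streamer and are
     * written directly through the cache stand-in.
     */
    public static void load(Map<String, byte[]> entries, Streamer streamer,
                            DirectCache cache, int flushEvery, int largeEntryBytes) {
        int sinceFlush = 0;
        for (Map.Entry<String, byte[]> e : entries.entrySet()) {
            if (e.getValue().length >= largeEntryBytes) {
                cache.put(e.getKey(), e.getValue()); // very large entry: skip the streamer
                continue;
            }
            streamer.addData(e.getKey(), e.getValue());
            if (++sinceFlush == flushEvery) {
                streamer.flush(); // keep the streamer's internal buffers small
                sinceFlush = 0;
            }
        }
        if (sinceFlush > 0) {
            streamer.flush(); // push the final partial batch
        }
    }

    public static void main(String[] args) {
        int[] counts = new int[3]; // [streamed, flushes, direct puts]
        Streamer streamer = new Streamer() {
            public void addData(String k, byte[] v) { counts[0]++; }
            public void flush() { counts[1]++; }
        };
        DirectCache cache = (k, v) -> counts[2]++;

        Map<String, byte[]> entries = new LinkedHashMap<>();
        for (int i = 0; i < 10; i++) {
            entries.put("small-" + i, new byte[16]);
        }
        entries.put("large-0", new byte[4096]);

        load(entries, streamer, cache, 4, 1024);
        System.out.println("streamed=" + counts[0] + " flushes=" + counts[1]
            + " direct=" + counts[2]);
    }
}
```

Flushing on a fixed entry count keeps the example simple; flushing on accumulated bytes may fit better when entry sizes vary a lot.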

Regards,
--
Ilya Kasnacheev

On Sun, 14 Apr 2019 at 18:45, kellan <kellan.burket@gmail.com> wrote:
> I seem to be running into some sort of memory issue with my DataStreamers,
> and I'd like to get a better idea of how they work behind the scenes to
> troubleshoot my problem.
>
> I have a cluster of 4 nodes, each of which is pulling files from S3 over an
> extended period of time and loading the contents. Each new file opens up a new
> DataStreamer, loads its contents and closes the DataStreamer. At most, each
> node has 4 DataStreamers writing to 4 different caches simultaneously. A
> new DataStreamer isn't created until the last one on that thread is closed.
> I wait for the futures to complete, then close the DataStreamer. So far so
> good.
>
> After my nodes have been running for a few hours, one or more inevitably ends up
> crashing. Sometimes the Java heap overflows and Java exits, and sometimes
> Java is killed by the kernel because of an OOM error.
>
> Here are my specs per node:
> Total Available Memory: 110GB
> Memory Assigned to All Data Regions: 50GB
> Total Checkpoint Page Buffers: 5GB
> Java Heap: 25GB
>
> Does DataStreamer.close block until data is loaded into the cache on remote
> nodes (I'm assuming it doesn't), and if not, is there any way to monitor the
> progress of loading data into the cache on the remote nodes/replicas, so I can
> slow down my DataStreamers to keep pace?
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/