Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 8C0DB2009F3 for ; Fri, 20 May 2016 23:39:17 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 8AA3C160A25; Fri, 20 May 2016 21:39:17 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id AB5541607AA for ; Fri, 20 May 2016 23:39:16 +0200 (CEST) Received: (qmail 71795 invoked by uid 500); 20 May 2016 21:39:15 -0000 Mailing-List: contact dev-help@phoenix.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@phoenix.apache.org Delivered-To: mailing list dev@phoenix.apache.org Received: (qmail 71784 invoked by uid 99); 20 May 2016 21:39:15 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 20 May 2016 21:39:15 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 7288DC0828 for ; Fri, 20 May 2016 21:39:15 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -4.646 X-Spam-Level: X-Spam-Status: No, score=-4.646 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-1.426] autolearn=disabled Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id w8SONNjUBz0F for ; Fri, 20 May 2016 21:39:14 +0000 (UTC) Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with SMTP id 003445F295 for ; Fri, 20 May 2016 21:39:13 +0000 (UTC) Received: (qmail 70973 invoked by uid 99); 20 May 2016 21:39:13 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 20 May 2016 21:39:13 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 06EA22C1F58 for ; Fri, 20 May 2016 21:39:13 +0000 (UTC) Date: Fri, 20 May 2016 21:39:13 +0000 (UTC) From: "James Taylor (JIRA)" To: dev@phoenix.incubator.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (PHOENIX-2929) Transactional writes get slow over time due to possible memory leak MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Fri, 20 May 2016 21:39:17 -0000 [ https://issues.apache.org/jira/browse/PHOENIX-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294293#comment-15294293 ] James Taylor commented on PHOENIX-2929: --------------------------------------- The odd thing, [~poornachandra], is that the client JVM is brought down after inserting 1M rows, let subsequent clients begin getting slower over time. Restarting the transaction manager makes it start going faster again. Any theories? > Transactional writes get slow over time due to possible memory leak > ------------------------------------------------------------------- > > Key: PHOENIX-2929 > URL: https://issues.apache.org/jira/browse/PHOENIX-2929 > Project: Phoenix > Issue Type: Bug > Reporter: Mujtaba Chohan > Attachments: txn-manager-log-during-slow-writes.log > > > Write to transactional table get up to ~3X slower compared to non-transactional table. > Details: > * Transaction Manager (TM) is *always* kept up and running between all data loads > * JVM for data loader is stopped after each 1M rows upsert > * Data loader uses 10 threads to write in parallel with 10K batch size > * Data load time is comparable for first few runs and it gets ~3X slow after transaction manager is up for 5-10 hours > Data loader client details for txn-table: ~100sec GC pause with transactional table. Total time to load 1M rows ~4mins. ~3.2G physical mem used. Java XMX param was set to 3G. > {code} > JSTAT for data loader - 10 sec interval > S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT > 40448.0 40960.0 0.0 21920.0 571904.0 0.0 349696.0 87959.7 42024.0 41469.1 4904.0 4739.9 13 0.174 2 0.072 0.246 > 333824.0 333824.0 0.0 121135.1 380928.0 84826.0 2097152.0 2009035.3 42280.0 41652.1 4904.0 4740.4 29 1.172 6 5.231 6.403 > 349184.0 349184.0 0.0 0.0 350208.0 350205.8 2097152.0 1952578.1 42536.0 41758.6 4904.0 4742.5 36 1.353 12 9.882 11.234 > 322560.0 336896.0 214566.6 104273.9 344576.0 344561.2 2097152.0 1859305.5 42536.0 41774.5 4904.0 4742.5 45 1.643 17 14.684 16.327 > 349184.0 349184.0 41351.1 0.0 350208.0 48147.1 2097152.0 1935605.0 42536.0 41811.3 4904.0 4742.5 54 2.081 24 19.862 21.943 > 349184.0 349184.0 156685.4 0.0 350208.0 0.0 2097152.0 2073856.9 42536.0 41817.5 4904.0 4742.5 68 2.632 30 22.649 25.281 > 344576.0 52224.0 0.0 51789.6 350208.0 0.0 2097152.0 1871862.2 42536.0 41823.4 4904.0 4742.5 79 2.991 37 27.764 30.756 > 349184.0 349184.0 302724.6 0.0 350208.0 0.0 2097152.0 2023258.1 42536.0 41830.3 4904.0 4742.5 92 3.559 43 30.363 33.922 > 305664.0 104448.0 0.0 104434.1 358400.0 114713.9 2097152.0 1857545.7 42536.0 41846.0 4904.0 4742.5 101 3.804 50 36.314 40.118 > 335872.0 66048.0 0.0 65680.0 350208.0 247000.5 2097152.0 1923473.1 42536.0 41852.3 4904.0 4742.5 115 4.364 56 39.622 43.986 > 349184.0 349184.0 0.0 349163.6 350208.0 0.0 2097152.0 1975419.5 42536.0 41861.6 4904.0 4742.5 125 4.863 60 44.782 49.645 > 349184.0 349184.0 0.0 17375.8 350208.0 0.0 2097152.0 1999279.6 42536.0 41863.4 4904.0 4742.5 133 5.091 66 50.602 55.693 > 340992.0 345088.0 0.0 0.0 350208.0 350208.0 2097152.0 1977601.3 42536.0 41894.1 4904.0 4742.5 146 5.566 73 54.388 59.953 > 349184.0 349184.0 0.0 0.0 350208.0 350208.0 2097152.0 1963150.5 42536.0 41906.5 4904.0 4742.5 158 6.036 80 58.024 64.060 > 349184.0 349184.0 0.0 63008.0 350208.0 56631.7 2097152.0 1942708.7 42536.0 41916.8 4904.0 4742.5 177 6.751 84 60.799 67.550 > 345600.0 349184.0 263005.5 0.0 350208.0 55075.8 2097152.0 1883022.2 42536.0 41925.8 4904.0 4742.5 192 7.295 88 65.144 72.439 > 349184.0 344064.0 139656.9 0.0 355328.0 0.0 2097152.0 2070558.8 42536.0 41935.4 4904.0 4742.5 204 7.803 95 68.618 76.421 > 349184.0 349184.0 63008.0 0.0 350208.0 0.0 2097152.0 1963035.3 42536.0 41971.9 4904.0 4742.5 218 8.339 102 72.534 80.873 > 349184.0 349184.0 0.0 17984.0 350208.0 253707.1 2097152.0 1848396.0 42536.0 41996.1 4904.0 4742.5 229 8.686 108 76.526 85.212 > 349184.0 324608.0 0.0 284036.3 351744.0 0.0 2097152.0 1776761.8 42536.0 42021.6 4904.0 4742.5 247 9.320 113 79.669 88.989 > 349184.0 349184.0 38510.2 0.0 350208.0 160454.7 2097152.0 1896405.6 42536.0 42036.0 4904.0 4742.5 256 9.555 115 80.723 90.278 > 342528.0 333824.0 147271.1 0.0 358912.0 312219.3 2097152.0 1882525.2 42792.0 42065.0 4904.0 4742.5 278 10.620 116 81.143 91.763 > 349184.0 283648.0 0.0 283645.6 370176.0 249240.7 2097152.0 1893727.0 42792.0 42066.5 4904.0 4742.5 303 11.738 117 81.684 93.422 > 349184.0 349184.0 49417.0 0.0 350208.0 172262.6 2097152.0 1127855.3 42792.0 42071.0 4904.0 4742.5 326 12.787 119 82.895 95.682 > 349184.0 349184.0 0.0 131245.0 350208.0 133019.3 2097152.0 1074752.6 42792.0 42081.4 4904.0 4742.5 349 13.943 120 83.448 97.391 > {code} > Data loader client details for non-txn table: ~12sec GC pause with non-transactional table. Total time to load 1M rows ~1mins. ~2G physical mem used. Java XMX param was set to 3G. > {code} > JSTAT for data loader - 10 sec interval > S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT > 24576.0 28160.0 0.0 0.0 195584.0 28008.1 349696.0 25257.3 35496.0 35020.3 4264.0 4037.5 6 0.056 2 0.071 0.128 > 79872.0 79872.0 0.0 79860.8 888832.0 0.0 753664.0 753092.4 41000.0 40361.6 4648.0 4563.6 17 0.616 3 0.071 0.688 > 259584.0 260096.0 129075.4 161535.6 528384.0 528384.0 937472.0 767859.8 41000.0 40458.7 4648.0 4564.6 31 1.319 4 1.622 2.941 > 283136.0 278016.0 23716.8 0.0 487424.0 143734.6 1498624.0 662740.5 41000.0 40479.8 4648.0 4564.6 48 2.459 7 3.122 5.582 > 240640.0 251392.0 162230.5 31674.1 545792.0 545792.0 1530368.0 642312.2 41000.0 40531.4 4648.0 4567.6 69 3.733 9 4.069 7.802 > 289280.0 282624.0 19730.4 0.0 468480.0 403700.7 1786880.0 672352.5 41256.0 40645.2 4648.0 4567.6 86 4.974 11 5.160 10.134 > 282624.0 290816.0 163586.1 0.0 466944.0 268299.4 1950208.0 1491203.8 41256.0 40667.0 4648.0 4567.6 108 6.288 12 5.723 12.011 > {code} > * TM uses ~4G physical memory (there's no GC pause for TM) > * After restarting TM, TM memory usage drops to 0.5G and data loader GC pauses decrease to ~15sec for transactional table and 1M gets loaded in ~1min. > This hints to possible leak in TM. Since this test was on a remote secure cluster, I haven't been able to profile it so I'll try to repro it locally to get more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)