phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-2929) Transactional writes get slow over time due to possible memory leak
Date Fri, 20 May 2016 21:39:13 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294293#comment-15294293 ]

James Taylor commented on PHOENIX-2929:
---------------------------------------

The odd thing, [~poornachandra], is that the client JVM is brought down after inserting 1M rows, yet subsequent client runs still get slower over time. Restarting the transaction manager makes writes fast again. Any theories?

> Transactional writes get slow over time due to possible memory leak
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-2929
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2929
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Mujtaba Chohan
>         Attachments: txn-manager-log-during-slow-writes.log
>
>
> Writes to a transactional table get up to ~3X slower compared to a non-transactional table.
 
> Details: 
> * Transaction Manager (TM) is *always* kept up and running between all data loads
> * JVM for data loader is stopped after each 1M-row upsert run
> * Data loader uses 10 threads to write in parallel with 10K batch size (a rough sketch of such a loader follows this list)
> * Data load time is comparable for the first few runs, then gets ~3X slower after the transaction manager has been up for 5-10 hours
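> A minimal sketch of the kind of loader described above, assuming a placeholder table T (created with TRANSACTIONAL=true) and a local quorum in the JDBC URL; the actual loader is not attached, so the names and schema here are illustrative only:
> {code}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.PreparedStatement;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.TimeUnit;
>
> public class TxnLoadSketch {
>     private static final int THREADS = 10;            // parallel writers, as described above
>     private static final int TOTAL_ROWS = 1_000_000;  // rows per run
>     private static final int BATCH_SIZE = 10_000;     // commit every 10K upserts
>
>     public static void main(String[] args) throws Exception {
>         ExecutorService pool = Executors.newFixedThreadPool(THREADS);
>         int rowsPerThread = TOTAL_ROWS / THREADS;
>         for (int t = 0; t < THREADS; t++) {
>             final int offset = t * rowsPerThread;
>             pool.submit(() -> {
>                 // "jdbc:phoenix:localhost" and table T (PK INTEGER PRIMARY KEY, V VARCHAR)
>                 // are placeholders, not the actual test table.
>                 try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
>                      PreparedStatement ps = conn.prepareStatement(
>                              "UPSERT INTO T (PK, V) VALUES (?, ?)")) {
>                     conn.setAutoCommit(false);
>                     for (int i = 0; i < rowsPerThread; i++) {
>                         ps.setInt(1, offset + i);
>                         ps.setString(2, "value-" + (offset + i));
>                         ps.executeUpdate();
>                         if ((i + 1) % BATCH_SIZE == 0) {
>                             conn.commit(); // flush a 10K batch
>                         }
>                     }
>                     conn.commit(); // flush any remainder
>                 } catch (Exception e) {
>                     e.printStackTrace();
>                 }
>             });
>         }
>         pool.shutdown();
>         pool.awaitTermination(1, TimeUnit.HOURS);
>         // Loader JVM exits here, matching the "JVM stopped after each 1M rows" step.
>     }
> }
> {code}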
> Data loader client details for txn-table: ~100sec GC pause with transactional table. Total time to load 1M rows ~4mins. ~3.2G physical mem used. Java XMX param was set to 3G.
> {code}
> JSTAT for data loader - 10 sec interval
>  S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
> 40448.0 40960.0  0.0   21920.0 571904.0   0.0     349696.0   87959.7   42024.0 41469.1 4904.0 4739.9     13    0.174   2      0.072    0.246
> 333824.0 333824.0  0.0   121135.1 380928.0 84826.0  2097152.0  2009035.3  42280.0 41652.1 4904.0 4740.4     29    1.172   6      5.231    6.403
> 349184.0 349184.0  0.0    0.0   350208.0 350205.8 2097152.0  1952578.1  42536.0 41758.6 4904.0 4742.5     36    1.353  12      9.882   11.234
> 322560.0 336896.0 214566.6 104273.9 344576.0 344561.2 2097152.0  1859305.5  42536.0 41774.5 4904.0 4742.5     45    1.643  17     14.684   16.327
> 349184.0 349184.0 41351.1  0.0   350208.0 48147.1  2097152.0  1935605.0  42536.0 41811.3 4904.0 4742.5     54    2.081  24     19.862   21.943
> 349184.0 349184.0 156685.4  0.0   350208.0   0.0    2097152.0  2073856.9  42536.0 41817.5 4904.0 4742.5     68    2.632  30     22.649   25.281
> 344576.0 52224.0  0.0   51789.6 350208.0   0.0    2097152.0  1871862.2  42536.0 41823.4 4904.0 4742.5     79    2.991  37     27.764   30.756
> 349184.0 349184.0 302724.6  0.0   350208.0   0.0    2097152.0  2023258.1  42536.0 41830.3 4904.0 4742.5     92    3.559  43     30.363   33.922
> 305664.0 104448.0  0.0   104434.1 358400.0 114713.9 2097152.0  1857545.7  42536.0 41846.0 4904.0 4742.5    101    3.804  50     36.314   40.118
> 335872.0 66048.0  0.0   65680.0 350208.0 247000.5 2097152.0  1923473.1  42536.0 41852.3 4904.0 4742.5    115    4.364  56     39.622   43.986
> 349184.0 349184.0  0.0   349163.6 350208.0   0.0    2097152.0  1975419.5  42536.0 41861.6 4904.0 4742.5    125    4.863  60     44.782   49.645
> 349184.0 349184.0  0.0   17375.8 350208.0   0.0    2097152.0  1999279.6  42536.0 41863.4 4904.0 4742.5    133    5.091  66     50.602   55.693
> 340992.0 345088.0  0.0    0.0   350208.0 350208.0 2097152.0  1977601.3  42536.0 41894.1 4904.0 4742.5    146    5.566  73     54.388   59.953
> 349184.0 349184.0  0.0    0.0   350208.0 350208.0 2097152.0  1963150.5  42536.0 41906.5 4904.0 4742.5    158    6.036  80     58.024   64.060
> 349184.0 349184.0  0.0   63008.0 350208.0 56631.7  2097152.0  1942708.7  42536.0 41916.8 4904.0 4742.5    177    6.751  84     60.799   67.550
> 345600.0 349184.0 263005.5  0.0   350208.0 55075.8  2097152.0  1883022.2  42536.0 41925.8 4904.0 4742.5    192    7.295  88     65.144   72.439
> 349184.0 344064.0 139656.9  0.0   355328.0   0.0    2097152.0  2070558.8  42536.0 41935.4 4904.0 4742.5    204    7.803  95     68.618   76.421
> 349184.0 349184.0 63008.0  0.0   350208.0   0.0    2097152.0  1963035.3  42536.0 41971.9 4904.0 4742.5    218    8.339  102    72.534   80.873
> 349184.0 349184.0  0.0   17984.0 350208.0 253707.1 2097152.0  1848396.0  42536.0 41996.1 4904.0 4742.5    229    8.686  108    76.526   85.212
> 349184.0 324608.0  0.0   284036.3 351744.0   0.0    2097152.0  1776761.8  42536.0 42021.6 4904.0 4742.5    247    9.320  113    79.669   88.989
> 349184.0 349184.0 38510.2  0.0   350208.0 160454.7 2097152.0  1896405.6  42536.0 42036.0 4904.0 4742.5    256    9.555  115    80.723   90.278
> 342528.0 333824.0 147271.1  0.0   358912.0 312219.3 2097152.0  1882525.2  42792.0 42065.0 4904.0 4742.5    278   10.620  116    81.143   91.763
> 349184.0 283648.0  0.0   283645.6 370176.0 249240.7 2097152.0  1893727.0  42792.0 42066.5 4904.0 4742.5    303   11.738  117    81.684   93.422
> 349184.0 349184.0 49417.0  0.0   350208.0 172262.6 2097152.0  1127855.3  42792.0 42071.0 4904.0 4742.5    326   12.787  119    82.895   95.682
> 349184.0 349184.0  0.0   131245.0 350208.0 133019.3 2097152.0  1074752.6  42792.0 42081.4 4904.0 4742.5    349   13.943  120    83.448   97.391
> {code}
> Data loader client details for non-txn table: ~12sec GC pause with non-transactional table. Total time to load 1M rows ~1min. ~2G physical mem used. Java XMX param was set to 3G.
> {code}
> JSTAT for data loader - 10 sec interval
>  S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
> 24576.0 28160.0  0.0    0.0   195584.0 28008.1   349696.0   25257.3   35496.0 35020.3 4264.0 4037.5      6    0.056   2      0.071    0.128
> 79872.0 79872.0  0.0   79860.8 888832.0   0.0     753664.0   753092.4  41000.0 40361.6 4648.0 4563.6     17    0.616   3      0.071    0.688
> 259584.0 260096.0 129075.4 161535.6 528384.0 528384.0  937472.0   767859.8  41000.0 40458.7 4648.0 4564.6     31    1.319   4      1.622    2.941
> 283136.0 278016.0 23716.8  0.0   487424.0 143734.6 1498624.0   662740.5  41000.0 40479.8 4648.0 4564.6     48    2.459   7      3.122    5.582
> 240640.0 251392.0 162230.5 31674.1 545792.0 545792.0 1530368.0   642312.2  41000.0 40531.4 4648.0 4567.6     69    3.733   9      4.069    7.802
> 289280.0 282624.0 19730.4  0.0   468480.0 403700.7 1786880.0   672352.5  41256.0 40645.2 4648.0 4567.6     86    4.974  11      5.160   10.134
> 282624.0 290816.0 163586.1  0.0   466944.0 268299.4 1950208.0  1491203.8  41256.0 40667.0 4648.0 4567.6    108    6.288  12      5.723   12.011
> {code}
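> For cross-checking these numbers from inside the loader itself (rather than via jstat), a small probe using the standard GarbageCollectorMXBean API could log cumulative GC counts and times; the totals should roughly track the YGC/YGCT and FGC/FGCT columns above (GCT climbs to ~97sec in the transactional run vs ~12sec in the non-transactional one). This is only a sketch, not part of the attached test:
> {code}
> import java.lang.management.GarbageCollectorMXBean;
> import java.lang.management.ManagementFactory;
>
> public class GcTimeProbe {
>     // Prints cumulative GC counts/times for the current JVM; getCollectionTime()
>     // is reported in milliseconds.
>     public static void dumpGcTotals() {
>         long totalMs = 0;
>         for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
>             System.out.printf("%s: count=%d time=%dms%n",
>                     gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
>             totalMs += gc.getCollectionTime();
>         }
>         System.out.printf("total GC time ~%.1fs%n", totalMs / 1000.0);
>     }
> }
> {code}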
> * TM uses ~4G physical memory (there's no GC pause for TM)
> * After restarting TM, TM memory usage drops to 0.5G, data loader GC pauses decrease to ~15sec for the transactional table, and 1M rows get loaded in ~1min.
> This hints at a possible leak in the TM. Since this test was on a remote secure cluster, I haven't been able to profile it, so I'll try to repro it locally to get more info.
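> Until a local repro is available, one low-effort check on the suspected TM-side growth would be to sample the TM's heap over JMX across several load runs and see whether used heap keeps climbing until the restart. The sketch below assumes the TM JVM is started with remote JMX enabled and uses a placeholder host/port (tm-host:10102):
> {code}
> import java.lang.management.ManagementFactory;
> import java.lang.management.MemoryMXBean;
> import javax.management.MBeanServerConnection;
> import javax.management.remote.JMXConnector;
> import javax.management.remote.JMXConnectorFactory;
> import javax.management.remote.JMXServiceURL;
>
> public class TmHeapWatcher {
>     public static void main(String[] args) throws Exception {
>         // "tm-host:10102" is a placeholder; requires -Dcom.sun.management.jmxremote.*
>         // options on the transaction manager JVM.
>         JMXServiceURL url =
>                 new JMXServiceURL("service:jmx:rmi:///jndi/rmi://tm-host:10102/jmxrmi");
>         try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
>             MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
>             MemoryMXBean mem = ManagementFactory.newPlatformMXBeanProxy(
>                     mbsc, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
>             for (int sample = 0; sample < 720; sample++) {  // ~12 hours at 1-minute intervals
>                 long usedMb = mem.getHeapMemoryUsage().getUsed() >> 20;
>                 System.out.println("TM heap used: " + usedMb + " MB");
>                 Thread.sleep(60_000);
>             }
>         }
>     }
> }
> {code}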



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
