hbase-issues mailing list archives

From "Appy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19715) Fix timing out test TestMultiRespectsLimits
Date Sat, 06 Jan 2018 01:47:03 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314278#comment-16314278 ]

Appy commented on HBASE-19715:
------------------------------

It's just crazy that we have ~6 million Object[] instances.
Many of them are arrays of size 32 containing nothing but AsyncRequestFutureImpl.class (yes, the
Class object; see the shared object id #1297). It looks like some Java-internal array,
but after browsing around the heap dump and seeing this pattern repeat over and over, I am sure
something is wrong.
!screenshot-4.png|width=800px!

> Fix timing out test TestMultiRespectsLimits
> -------------------------------------------
>
>                 Key: HBASE-19715
>                 URL: https://issues.apache.org/jira/browse/HBASE-19715
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Appy
>            Assignee: Appy
>         Attachments: failued.txt, passed.txt, screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png
>
>
> !screenshot-1.png|width=800px!
> Attached logs for both cases, when it passes and fails.
> Link (temporary) to logs:
> passed: http://104.198.223.121:8080/job/HBase-Flaky-Tests/33449/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestMultiRespectsLimits-output.txt/*view*/
> failed: http://104.198.223.121:8080/job/HBase-Flaky-Tests/33455/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestMultiRespectsLimits-output.txt/*view*/
> Correlating across more runs, whenever the test passes, it does so within 10-30 sec of the 3 min deadline for medium tests.
> So I think we can make it pass by just increasing the timeout.
> But I'm a bit skeptical after seeing all those long GC pauses (10+ sec) in the log. The test code doesn't seem to be doing anything that intensive. Are we mismanaging memory somewhere?
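
As a rough illustration of the GC concern in the description above, the sketch below measures how much accumulated GC time the JVM reports around a piece of work, using the standard java.lang.management API. The class name GcTimeProbe, its methods, and the allocation loop are hypothetical and not taken from HBase or from this test; the loop only mimics the "many 32-slot Object[] arrays" pattern seen in the heap dump.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of HBase: reports GC time spent while a task runs.
public class GcTimeProbe {

    // Sums the accumulated collection time (ms) over all collectors.
    // getCollectionTime() returns -1 when the value is undefined, so those are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Runs the task and returns the GC time (ms) reported during it.
    static long gcMillisDuring(Runnable work) {
        long before = totalGcMillis();
        work.run();
        return totalGcMillis() - before;
    }

    public static void main(String[] args) {
        long spent = gcMillisDuring(() -> {
            // Assumed workload: many short-lived 32-slot arrays holding a Class object.
            long sink = 0;
            for (int i = 0; i < 1_000_000; i++) {
                Object[] arr = new Object[32];
                arr[0] = GcTimeProbe.class;
                sink += arr.length;
            }
            if (sink == 0) {
                throw new IllegalStateException("unreachable");
            }
        });
        System.out.println("GC time during workload: " + spent + " ms");
    }
}
```

Wrapping the body of a slow test in something like gcMillisDuring would show whether most of the wall-clock time goes to collection rather than to the test logic. The reported figure is accumulated collector time as the JVM defines it, which is not necessarily identical to stop-the-world pause time.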




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
