hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14772) Improve zombie detector; be more discerning
Date Fri, 04 Dec 2015 03:33:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039695#comment-15039695 ]

Hudson commented on HBASE-14772:
--------------------------------

FAILURE: Integrated in HBase-Trunk_matrix #529 (See [https://builds.apache.org/job/HBase-Trunk_matrix/529/])
 HBASE-14772 Improve zombie detector; be more discerning; part2; (stack: rev 5e430837d3e4a7d159e84964357297c8ab42430d)
* dev-support/test-patch.sh
* dev-support/zombie-detector.sh
 HBASE-14772 Improve zombie detector; be more discerning; part2; (stack: rev 7117a2e35d42ef4e3f17b0a8f891fc5200cd0890)
* dev-support/zombie-detector.sh


> Improve zombie detector; be more discerning
> -------------------------------------------
>
>                 Key: HBASE-14772
>                 URL: https://issues.apache.org/jira/browse/HBASE-14772
>             Project: HBase
>          Issue Type: Sub-task
>          Components: test
>            Reporter: stack
>            Assignee: stack
>             Fix For: 2.0.0
>
>         Attachments: 14772v3.patch, zombie.patch, zombiev2.patch
>
>
> Currently, any surefire process with the hbase flag is a potential zombie. Our zombie check takes a reading and, if it finds candidate zombies, waits 30 seconds and then takes another reading. If a concurrent build is going on, the zombie detector will come up positive in both readings even though the adjacent test run may be making progress; i.e. the cast of surefire processes may have changed between readings, but our detector just sees the presence of hbase surefire processes.
> Here is an example:
> {code}
> Suspicious java process found - waiting 30s to see if there are just slow to stop
> There appear to be 5 zombie tests, they should have been killed by surefire but survived
> 12823 surefirebooter852180186418035480.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
> 7653 surefirebooter8579074445899448699.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
> 12614 surefirebooter136529596936417090.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
> 7836 surefirebooter3217047564606450448.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
> 13566 surefirebooter2084039411151963494.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
> ************ BEGIN zombies jstack extract
> ************ END  zombies jstack extract
> {code}
> 5 is the number of forked processes we allow when doing medium and large tests, so an adjacent build will always show as '5 zombies'.
> The detector needs to discern whether the list of processes changes between readings.
> Can I also add a tag per build run that all forked processes pick up, so I can look at the current build's progeny only?
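
The "changes between readings" idea in the description above can be sketched in shell as follows. This is a minimal illustration, not the code committed to dev-support/zombie-detector.sh: the function name `persisting_pids` is hypothetical, and it assumes each reading is a whitespace-separated list of PIDs. Only a PID that shows up in both readings stays a zombie candidate; a process that appeared or disappeared in between belongs to a run that is still making progress.

```shell
# Hypothetical helper (an assumption for illustration, not from the patch).
# $1: PIDs seen in the first reading, whitespace-separated
# $2: PIDs seen in the second reading, whitespace-separated
# Prints each PID present in both readings, one per line, in first-reading order.
persisting_pids() {
  # Unquoted expansion collapses newlines and repeated spaces into single spaces.
  _second=" $(echo $2) "
  for _pid in $1; do
    case "$_second" in
      # Surrounding spaces force a whole-PID match (76 must not match 7653).
      *" $_pid "*) echo "$_pid" ;;
    esac
  done
}
```

With the five PIDs from the example as the first reading, `persisting_pids "12823 7653 12614 7836 13566" "7653 20001 20002"` prints only `7653`, so the report would name one surviving process rather than five. An empty result means the cast of processes turned over completely between readings.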



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
