hbase-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Analyzing output of Jenkins build Was: Jenkins build is back to normal : HBase-TRUNK #2304
Date Mon, 10 Oct 2011 04:24:54 GMT
That jstack just looks like the trace of the maven process - there
should be another JVM which is actually running the tests.

-Todd
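
One possible way to find that second JVM is to filter `jps -l` output for the forked Surefire process and run jstack on it instead of on the Maven launcher. This is only a minimal sketch: it assumes the forked test JVM shows up with "surefire" in its main class or jar name (typical for a surefirebooter jar, but not confirmed in this thread), and the helper names `surefire_pids` and `dump_threads` are hypothetical.

```python
import subprocess


def surefire_pids(jps_output):
    """Return PIDs from `jps -l` output whose main class/jar mentions surefire.

    Assumption: the forked test JVM is identifiable by the substring
    "surefire"; adjust the match if your build forks tests differently.
    """
    pids = []
    for line in jps_output.splitlines():
        parts = line.split(None, 1)  # "<pid> <main class or jar path>"
        if len(parts) == 2 and "surefire" in parts[1].lower():
            pids.append(int(parts[0]))
    return pids


def dump_threads(pids):
    """Run jstack against each PID and return {pid: thread dump text}."""
    dumps = {}
    for pid in pids:
        result = subprocess.run(
            ["jstack", str(pid)], capture_output=True, text=True
        )
        dumps[pid] = result.stdout
    return dumps
```

On the box where the test hangs, something like `dump_threads(surefire_pids(subprocess.run(["jps", "-l"], capture_output=True, text=True).stdout))` would collect the dumps, provided `jps` and `jstack` from the JDK are on the PATH.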

On Sat, Oct 8, 2011 at 10:14 AM, Li Pi <lpi@ucsd.edu> wrote:
> I got the thing to fail on my VMware box. Here's the stack trace.
>
> Doesn't look like the cache itself is hanging. The 4 runnable threads:
>
> "Attach Listener" daemon prio=10 tid=0x0000000001c48000 nid=0x4cac
> waiting on condition [0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
> "Thread-5" prio=10 tid=0x00007fb714117800 nid=0x4c03 runnable
> [0x00007fb720a1e000]
>   java.lang.Thread.State: RUNNABLE
>        at java.io.FileInputStream.readBytes(Native Method)
>        at java.io.FileInputStream.read(FileInputStream.java:236)
>        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>        - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>        at java.io.InputStreamReader.read(InputStreamReader.java:184)
>        at java.io.BufferedReader.fill(BufferedReader.java:153)
>        at java.io.BufferedReader.readLine(BufferedReader.java:316)
>        - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>        at java.io.BufferedReader.readLine(BufferedReader.java:379)
>        at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>
> "Thread-4" prio=10 tid=0x00007fb714114800 nid=0x4c01 runnable
> [0x00007fb720e36000]
>   java.lang.Thread.State: RUNNABLE
>        at java.io.FileInputStream.readBytes(Native Method)
>        at java.io.FileInputStream.read(FileInputStream.java:236)
>        at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>        - locked <0x00000000f25c6ce8> (a java.io.BufferedInputStream)
>        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>        - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>        at java.io.InputStreamReader.read(InputStreamReader.java:184)
>        at java.io.BufferedReader.fill(BufferedReader.java:153)
>        at java.io.BufferedReader.readLine(BufferedReader.java:316)
>        - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>        at java.io.BufferedReader.readLine(BufferedReader.java:379)
>        at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>
> "process reaper" daemon prio=10 tid=0x00007fb71401e800 nid=0x4bfe
> runnable [0x00007fb720c34000]
>   java.lang.Thread.State: RUNNABLE
>        at java.lang.UNIXProcess.waitForProcessExit(Native Method)
>        at java.lang.UNIXProcess.access$900(UNIXProcess.java:36)
>        at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:148)
>
>
> Looks like FileInputStream.readBytes() is blocking.
>
>
> On Sat, Oct 8, 2011 at 10:04 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> Scott:
>> Do you have time to write a script for analyzing the Jenkins output and
>> put it on HBASE-4480?
>> Here is an idea from Ramkrishna:
>>
>> Every line that contains "Running" can be parsed to check that the next
>> "Running" line appears after one intervening line.
>> For example, if the first "Running" is on line 11, the next "Running"
>> should be on line 13.
>> If this pattern breaks somewhere, then the test at that point is hanging.
>> This is just one idea. If we can figure out something better, we can take
>> it up.
>>
>> Cheers
>>
>> On Sat, Oct 8, 2011 at 9:53 AM, Jesse Yates <jesse.k.yates@gmail.com> wrote:
>>
>>> The script to do this was written in 4480. Just needs some +1s a
>>> - It works pretty well.
>>>
>>> We might want to also mod it to take in a file that is the output of a run
>>> and analyze that.
>>>
>>> - Jesse Yates
>>>
>>> Sent from my iPhone.
>>>
>>> On Oct 8, 2011, at 2:51 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>> > Parsing test output will do.
>>> >
>>> >
>>> >
>>> > On Oct 7, 2011, at 11:44 PM, Akash Ashok <thehellmaker@gmail.com> wrote:
>>> >
>>> >> Hi Ted & Ram
>>> >>
>>> >> Just figured out the hung test case in both
>>> >>
>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2304/console
>>> >>
>>> >> Running org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache
>>> >> Running org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer
>>> >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.858 sec
>>> >>
>>> >> TestSlabCache is the culprit
>>> >>
>>> >> I just copied the output into Notepad++ and searched for "Running";
>>> >> it highlighted the matches, which made the hung test easier to find :)
>>> >>
>>> >> And about the script: is the idea to parse this output and figure out
>>> >> the hung test case, or is there a plan to parse the surefire reports XML?
>>> >>
>>> >> Cheers,
>>> >> Akash A
>>> >>
>>> >> On Sat, Oct 8, 2011 at 11:13 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>> >>
>>> >>> Yeah, we need such a script.
>>> >>> I went over the tests in
>>> >>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>> >>> and couldn't find the hanging test.
>>> >>>
>>> >>> Cheers
>>> >>>
>>> >>> On Fri, Oct 7, 2011 at 10:33 PM, Ramakrishna S Vasudevan 00902313 <ramakrishnas@huawei.com> wrote:
>>> >>>
>>> >>>> Ted
>>> >>>>
>>> >>>> Weren't we already discussing a script to find hung tests at one
>>> >>>> point?
>>> >>>>
>>> >>>> Regards
>>> >>>> Ram
>>> >>>>
>>> >>>>
>>> >>>> ----- Original Message -----
>>> >>>> From: Ted Yu <yuzhihong@gmail.com>
>>> >>>> Date: Saturday, October 8, 2011 10:58 am
>>> >>>> Subject: Re: Jenkins build is back to normal : HBase-TRUNK #2304
>>> >>>> To: dev@hbase.apache.org
>>> >>>>
>>> >>>>> From
>>> >>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console,
>>> >>>>> it wasn't obvious which test(s) hung.
>>> >>>>> But the following error clearly indicated there was some hanging Java
>>> >>>>> process:
>>> >>>>>
>>> >>>>> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.9:test (default-test) on project hbase: Failure or timeout -> [Help 1]
>>> >>>>> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.9:test (default-test) on project hbase: Failure or timeout
>>> >>>>>
>>> >>>>> Unfortunately we don't have access to the build machine.
>>> >>>>>
>>> >>>>> On Fri, Oct 7, 2011 at 10:14 PM, Akash Ashok
>>> >>>>> <thehellmaker@gmail.com> wrote:
>>> >>>>>
>>> >>>>>> Oh cool. Build is back to normal. Could someone tell me what the
>>> >>>>>> issue was? Why was it failing even though there were no failures?
>>> >>>>>>
>>> >>>>>> On Sat, Oct 8, 2011 at 4:45 AM, Apache Jenkins Server <jenkins@builds.apache.org> wrote:
>>> >>>>>>
>>> >>>>>>> See <https://builds.apache.org/job/HBase-TRUNK/2304/>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>>
>>
>



-- 
Todd Lipcon
Software Engineer, Cloudera
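
The heuristic discussed in the quoted thread (each "Running" line should get its "Tests run:" result before the next "Running" line appears) can be sketched in Python. This is a minimal sketch and not the script attached to HBASE-4480: the function name `find_hung_tests` and the two regular expressions are assumptions based only on the console excerpt quoted above, and it assumes the tests run sequentially in a single forked JVM.

```python
import re

# Line formats assumed from the console excerpt quoted in the thread.
RUNNING = re.compile(r"^Running (\S+)")
RESULT = re.compile(r"^Tests run: \d+")


def find_hung_tests(console_text):
    """Return test classes whose "Running" line never got a result line.

    A test is reported when another "Running" line (or the end of the log)
    is reached before a "Tests run:" line has been seen for it.
    """
    hung = []
    pending = None  # test class still waiting for its result line
    for raw in console_text.splitlines():
        line = raw.strip()
        match = RUNNING.match(line)
        if match:
            if pending is not None:
                hung.append(pending)
            pending = match.group(1)
        elif RESULT.match(line):
            pending = None
    if pending is not None:
        hung.append(pending)
    return hung
```

Fed the three console lines quoted by Akash, this reports only org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache. With parallel test execution the result lines may interleave, so the heuristic would need adjusting.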
