hbase-dev mailing list archives

From Li Pi <...@ucsd.edu>
Subject Re: Analyzing output of Jenkins build Was: Jenkins build is back to normal : HBase-TRUNK #2304
Date Mon, 10 Oct 2011 04:31:29 GMT
Top showed only one java process. I'll try this again once I get back.

On Sun, Oct 9, 2011 at 9:24 PM, Todd Lipcon <todd@cloudera.com> wrote:
> That jstack just looks like the trace of the maven process - there
> should be another JVM which is actually running the tests.
>
> -Todd
>
> On Sat, Oct 8, 2011 at 10:14 AM, Li Pi <lpi@ucsd.edu> wrote:
>> I got the thing to fail on my VMware box. Here's the stack trace.
>>
>> Doesn't look like the cache itself is hanging. The 4 runnable threads:
>>
>> "Attach Listener" daemon prio=10 tid=0x0000000001c48000 nid=0x4cac
>> waiting on condition [0x0000000000000000]
>>   java.lang.Thread.State: RUNNABLE
>>
>> "Thread-5" prio=10 tid=0x00007fb714117800 nid=0x4c03 runnable
>> [0x00007fb720a1e000]
>>   java.lang.Thread.State: RUNNABLE
>>        at java.io.FileInputStream.readBytes(Native Method)
>>        at java.io.FileInputStream.read(FileInputStream.java:236)
>>        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>>        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>>        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>>        - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>>        at java.io.InputStreamReader.read(InputStreamReader.java:184)
>>        at java.io.BufferedReader.fill(BufferedReader.java:153)
>>        at java.io.BufferedReader.readLine(BufferedReader.java:316)
>>        - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>>        at java.io.BufferedReader.readLine(BufferedReader.java:379)
>>        at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>>
>> "Thread-4" prio=10 tid=0x00007fb714114800 nid=0x4c01 runnable
>> [0x00007fb720e36000]
>>   java.lang.Thread.State: RUNNABLE
>>        at java.io.FileInputStream.readBytes(Native Method)
>>        at java.io.FileInputStream.read(FileInputStream.java:236)
>>        at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>>        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>        - locked <0x00000000f25c6ce8> (a java.io.BufferedInputStream)
>>        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>>        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>>        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>>        - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>>        at java.io.InputStreamReader.read(InputStreamReader.java:184)
>>        at java.io.BufferedReader.fill(BufferedReader.java:153)
>>        at java.io.BufferedReader.readLine(BufferedReader.java:316)
>>        - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>>        at java.io.BufferedReader.readLine(BufferedReader.java:379)
>>        at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>>
>> "process reaper" daemon prio=10 tid=0x00007fb71401e800 nid=0x4bfe
>> runnable [0x00007fb720c34000]
>>   java.lang.Thread.State: RUNNABLE
>>        at java.lang.UNIXProcess.waitForProcessExit(Native Method)
>>        at java.lang.UNIXProcess.access$900(UNIXProcess.java:36)
>>        at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:148)
>>
>>
>> Looks like FileInputStream.readBytes() is blocking.
>>
>>
>> On Sat, Oct 8, 2011 at 10:04 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>> Scott:
>>> Do you have time to write a script for analyzing the output of Jenkins and
>>> put it on HBASE-4480?
>>> Here is an idea from Ramkrishna:
>>>
>>> Every line that contains "Running" can be parsed to check that each
>>> subsequent "Running" line appears one hop later.
>>> For example, if the first "Running" is on the 11th line, the next "Running"
>>> should be on the 13th.
>>> If this pattern breaks somewhere, then that test is hanging.
>>> This is just one idea. If we can figure out something better, we can take
>>> it up.
>>>
>>> Cheers
>>>
>>> On Sat, Oct 8, 2011 at 9:53 AM, Jesse Yates <jesse.k.yates@gmail.com> wrote:
>>>
>>>> The script to do this was written in 4480. Just needs some +1s a
>>>> - It works pretty well.
>>>>
>>>> We might want to also mod it to take in a file that is the output of a run
>>>> and analyze that.
>>>>
>>>> - Jesse Yates
>>>>
>>>> Sent from my iPhone.
>>>>
>>>> On Oct 8, 2011, at 2:51 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>>
>>>> > Parsing test output will do.
>>>> >
>>>> >
>>>> >
>>>> > On Oct 7, 2011, at 11:44 PM, Akash Ashok <thehellmaker@gmail.com> wrote:
>>>> >
>>>> >> Hi Ted & Ram
>>>> >>
>>>> >> Just figured out the hung test case in both
>>>> >>
>>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>>> >>
>>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2304/console
>>>> >>
>>>> >> Running org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache
>>>> >> Running org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer
>>>> >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.858 sec
>>>> >>
>>>> >> TestSlabCache is the culprit
>>>> >>
>>>> >> I just copied it into Notepad++ and searched for "Running"; it highlighted
>>>> >> the matches, which made it easier to find :)
>>>> >>
>>>> >> And about the script: is the idea to parse this output and figure out the
>>>> >> hung test case, or is there a plan to parse the surefire report XML?
>>>> >>
>>>> >> Cheers,
>>>> >> Akash A
>>>> >>
>>>> >> On Sat, Oct 8, 2011 at 11:13 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>> >>
>>>> >>> Yeah, we need such a script.
>>>> >>> I went over the tests in
>>>> >>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>>> >>> and couldn't find the hanging test.
>>>> >>>
>>>> >>> Cheers
>>>> >>>
>>>> >>> On Fri, Oct 7, 2011 at 10:33 PM, Ramakrishna S Vasudevan 00902313
>>>> >>> <ramakrishnas@huawei.com> wrote:
>>>> >>>
>>>> >>>> Ted
>>>> >>>>
>>>> >>>> Weren't we already discussing a script to find hung tests at one
>>>> >>>> point?
>>>> >>>>
>>>> >>>> Regards
>>>> >>>> Ram
>>>> >>>>
>>>> >>>>
>>>> >>>> ----- Original Message -----
>>>> >>>> From: Ted Yu <yuzhihong@gmail.com>
>>>> >>>> Date: Saturday, October 8, 2011 10:58 am
>>>> >>>> Subject: Re: Jenkins build is back to normal : HBase-TRUNK #2304
>>>> >>>> To: dev@hbase.apache.org
>>>> >>>>
>>>> >>>>> From
>>>> >>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>>> >>>>> it wasn't obvious which test(s) hung.
>>>> >>>>> But the following error clearly indicated there was some hanging Java
>>>> >>>>> process:
>>>> >>>>>
>>>> >>>>> [ERROR] Failed to execute goal
>>>> >>>>> org.apache.maven.plugins:maven-surefire-plugin:2.9:test (default-test)
>>>> >>>>> on project hbase: Failure or timeout -> [Help 1]
>>>> >>>>> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to
>>>> >>>>> execute goal org.apache.maven.plugins:maven-surefire-plugin:2.9:test
>>>> >>>>> (default-test) on project hbase: Failure or timeout
>>>> >>>>>
>>>> >>>>> Unluckily we don't have access to the build machine.
>>>> >>>>>
>>>> >>>>> On Fri, Oct 7, 2011 at 10:14 PM, Akash Ashok
>>>> >>>>> <thehellmaker@gmail.com> wrote:
>>>> >>>>>
>>>> >>>>>> Oh cool. Build is back to normal. Could someone tell me what the
>>>> >>>>>> issue was?
>>>> >>>>>> Why was it failing even though there were no failures?
>>>> >>>>>>
>>>> >>>>>> On Sat, Oct 8, 2011 at 4:45 AM, Apache Jenkins Server
>>>> >>>>>> <jenkins@builds.apache.org> wrote:
>>>> >>>>>>
>>>> >>>>>>> See <https://builds.apache.org/job/HBase-TRUNK/2304/>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>>>
>>>
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
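
The "Running" heuristic quoted in the thread above (credited to Ramkrishna) can be sketched in a few lines of Python. This is only an illustration and not the script attached to HBASE-4480: the function name find_unfinished_tests and both regular expressions are assumptions made for this example, and it presumes the tests run serially, so that every "Running <class>" line is normally followed by that test's "Tests run: ..., Time elapsed: ..." line.

```python
import re
import sys

# Assumed line shapes, taken from the console excerpt quoted in the thread.
RUNNING = re.compile(r"^Running (\S+)$")
RESULT = re.compile(r"^Tests run: \d+,.*Time elapsed")


def find_unfinished_tests(lines):
    """Return test classes that printed 'Running ...' but never a result line."""
    unfinished = []
    pending = None  # test that has started but not yet reported a result
    for raw in lines:
        line = raw.strip()
        match = RUNNING.match(line)
        if match:
            # A new test started while the previous one never reported:
            # the previous one is a hang candidate.
            if pending is not None:
                unfinished.append(pending)
            pending = match.group(1)
        elif RESULT.match(line):
            pending = None
    # A test still pending at the end of the log never finished either.
    if pending is not None:
        unfinished.append(pending)
    return unfinished


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python find_hung.py console.txt
    with open(sys.argv[1], errors="replace") as console:
        for name in find_unfinished_tests(console):
            print("No result reported for:", name)
```

Fed the three console lines quoted in the thread, this reports org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache, because its "Running" line is followed directly by another "Running" line. Requiring "Time elapsed" in the result pattern keeps the final summary line of a build from being mistaken for a per-test result.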
