hbase-dev mailing list archives

From Ramkrishna S Vasudevan <ramkrishna.vasude...@huawei.com>
Subject RE: Status of 0.92RC
Date Mon, 28 Nov 2011 04:37:58 GMT
Dear Lars

It was a nice learning experience.  I really appreciate you making things clear.

Thanks to Todd for the same. :)

Regards
Ram



-----Original Message-----
From: lars hofhansl [mailto:lhofhansl@yahoo.com] 
Sent: Monday, November 28, 2011 5:08 AM
To: Todd Lipcon; dev@hbase.apache.org
Subject: Re: Status of 0.92RC

Just committed HBASE-4874.
Even though file:///dev/urandom worked in my tests as well, I ended up leaving
it at file:/dev/./urandom.
A bit of googling brought up *many* references for file:/dev/./urandom
(including the Sun bug below) and only one or two for file:///dev/urandom.

I hope that this will generally help with hanging tests. TestHCM does not
use SecureRandom, but TestHCM.testConnectionUniqueness()
is generating many random ints, thereby exhausting the system's entropy
reservoirs. So any test following TestHCM could hang when
it needs a secure random number (such as UUID.randomUUID(), which is used to
generate a new cluster UUID for the mini clusters).
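
A minimal sketch for checking this on a given machine, assuming nothing beyond the JDK: it prints the java.security.egd value the JVM was started with and times a single UUID.randomUUID() call. The class and method names (EgdCheck, configuredEgd, timeRandomUuidMillis) are made up for illustration and are not part of HBase or of HBASE-4874.

```java
import java.util.UUID;

public class EgdCheck {
    /** Returns the entropy source passed via -Djava.security.egd, or a placeholder when unset. */
    static String configuredEgd() {
        return System.getProperty("java.security.egd", "(not set)");
    }

    /** Times one UUID.randomUUID() call, in milliseconds. */
    static long timeRandomUuidMillis() {
        long start = System.nanoTime();
        // randomUUID() draws from a SecureRandom; the first call in a JVM also seeds it,
        // which is where a blocking entropy source would show up as a long delay.
        UUID.randomUUID();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        System.out.println("java.security.egd = " + configuredEgd());
        System.out.println("randomUUID() took " + timeRandomUuidMillis() + " ms");
    }
}
```

Running it once with and once without -Djava.security.egd=file:/dev/./urandom on a machine that shows the hang should make the difference visible; on a machine with plenty of entropy both runs will likely return quickly.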


-- Lars

________________________________
From: Todd Lipcon <todd@cloudera.com>
To: dev@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com> 
Sent: Saturday, November 26, 2011 9:09 PM
Subject: Re: Status of 0.92RC

On Sat, Nov 26, 2011 at 5:10 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
> Are you sure this works?
>
> According to this:
> http://bugs.sun.com/view_bug.do;jsessionid=ff625daf459fdffffffffcd54f1c775299e0?bug_id=6202721
> the JDK will treat /dev/urandom as a special string and use /dev/random
> anyway (hence the workaround with /dev/./urandom)

It worked for me so long as I had enough slashes - file:///dev/urandom
and not file:/dev/urandom as some other resources showed.
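
To illustrate why the spelling can matter, here is a small sketch. It assumes, as the Sun bug report above describes, that the JDK special-cases the exact string file:/dev/urandom; the literal comparison below is only an illustration of that idea and is not JDK code. The class and method names are hypothetical.

```java
import java.net.URI;

public class EgdSpellings {
    /** The spelling that, per the bug report, the JDK treats specially (assumption). */
    static final String SPECIAL_CASED = "file:/dev/urandom";

    /** Returns the normalized path of the URI, i.e. the device the spelling names. */
    static String normalizedPath(String spelling) {
        return URI.create(spelling).normalize().getPath();
    }

    /** True only when the spelling is character-for-character the special-cased string. */
    static boolean isLiteralMatch(String spelling) {
        return SPECIAL_CASED.equals(spelling);
    }

    public static void main(String[] args) {
        String[] spellings = {
            "file:/dev/urandom", "file:///dev/urandom", "file:/dev/./urandom"
        };
        for (String s : spellings) {
            System.out.println(s + " -> path=" + normalizedPath(s)
                + ", literal match=" + isLiteralMatch(s));
        }
    }
}
```

All three spellings name the same device, but only the first is a literal match, which would be consistent with both file:///dev/urandom and file:/dev/./urandom avoiding the special case.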

-Todd

>
>
> ----- Original Message -----
> From: Todd Lipcon <todd@cloudera.com>
> To: dev@hbase.apache.org
> Cc: lars hofhansl <lhofhansl@yahoo.com>
> Sent: Friday, November 25, 2011 10:22 PM
> Subject: Re: Status of 0.92RC
>
> In Hadoop we use this in our pom:
>             <java.security.egd>file:///dev/urandom</java.security.egd>
>
> which also works in JDK6. See HADOOP-7841
>
>
> On Fri, Nov 25, 2011 at 10:12 PM, Li Pi <li@idle.li> wrote:
>> Very, very nice discovery!
>>
>> Once I saw the move your mouse around part, I smiled.
>>
>> Entropy can also be gathered from other system events, such as network
>> traffic, so on a production machine, you should always have enough
>> entropy. So it should be fine to change the test to use urandom.
>>
>> On Fri, Nov 25, 2011 at 9:06 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>>> Here is something that will boggle your mind:
>>> After you run TestHCM often enough it will hang until you move your
>>> mouse around!
>>>
>>> No joke... But what sounds like an impossibility is actually related to
>>> the use of SecureRandom.
>>> SecureRandom uses /dev/random on Linux and that gathers entropy from
>>> system events such as network activity, disk activity, keyboard activity,
>>> mouse movement, etc.
>>> If not enough entropy is available, /dev/random will block until
>>> enough entropy is gathered!
>>>
>>> When I add this to the command line (via pom.xml) TestHCM never hangs:
>>> "-Djava.security.egd=file:/dev/./urandom"
>>>
>>> (note that -Djava.security.egd=file:/dev/urandom does not work for some
>>> reason on JDK 1.5 or newer; the extra ./ is needed)
>>>
>>>
>>> This instructs the JDK to use /dev/urandom for SecureRandom, which unlike
>>> /dev/random will never block, but if not enough entropy exists it will
>>> generate random numbers of lesser quality... But for tests we don't care.
>>>
>>>
>>> That explains why tests only time out sometimes... when the system
>>> happens to run out of entropy bits when a secure random is used.
>>>
>>> -- Lars
>>>
>>>
>>> ________________________________
>>> From: Stack <stack@duboce.net>
>>> To: dev@hbase.apache.org
>>> Cc: lars hofhansl <lhofhansl@yahoo.com>
>>> Sent: Friday, November 25, 2011 7:14 PM
>>> Subject: Re: Status of 0.92RC
>>>
>>> On Fri, Nov 25, 2011 at 6:16 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>> I then looped TestHCM 4 times and there was no test failure.
>>>>
>>>
>>> It's fine on mac.  On ubuntu:
>>>
>>> -------------------------------------------------------------------------------
>>> Test set: org.apache.hadoop.hbase.client.TestHCM
>>> -------------------------------------------------------------------------------
>>> Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 757.578 sec <<< FAILURE!
>>> testClosing(org.apache.hadoop.hbase.client.TestHCM)  Time elapsed: 35.34 sec  <<< FAILURE!
>>> java.lang.AssertionError
>>>         at org.junit.Assert.fail(Assert.java:92)
>>>         at org.junit.Assert.assertTrue(Assert.java:43)
>>>         at org.junit.Assert.assertFalse(Assert.java:68)
>>>         at org.junit.Assert.assertFalse(Assert.java:79)
>>>         at org.apache.hadoop.hbase.client.TestHCM.testClosing(TestHCM.java:221)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>> ....
>>>
>>> Line numbers are off because I'm messing.  It's saying connection 1 is
>>> closed if I test it just after creating it.
>>>
>>> St.Ack
>>>
>>>> On Fri, Nov 25, 2011 at 5:39 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>>
>>>>> I looped TestHCM#testClosing 5 times on MacBook and didn't see test
>>>>> failure.
>>>>>
>>>>> Stack:
>>>>> Can you share the test output ?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> On Fri, Nov 25, 2011 at 5:04 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>>>>>
>>>>>> I added testClosing as part of HBASE-4805, I'll have a look as soon
>>>>>> as I get a chance.
>>>>>>
>>>>>>
>>>>>>
>>>>>> ________________________________
>>>>>>  From: Stack <stack@duboce.net>
>>>>>> To: HBase Dev List <dev@hbase.apache.org>
>>>>>> Sent: Friday, November 25, 2011 2:12 PM
>>>>>> Subject: Status of 0.92RC
>>>>>>
>>>>>> I'm having a little difficulty getting all tests to pass.  On
>>>>>> linux/ubuntu, TestHCM (testClosing strange issue) and TestReplication
>>>>>> are failing for me.  On mac osx, it'll build without fail about 50% of
>>>>>> the time.  I'd like to make it so tests pass all the time before
>>>>>> cutting the RC.  That's what I'm at these times.
>>>>>>
>>>>>> Also, 0.92 build on jenkins has been turned off by Apache
>>>>>> Infrastructure.  It was hanging.  It's done this in the past too and
>>>>>> when it hangs it requires a jenkins reboot which doesn't make Apache
>>>>>> Infrastructure team too happy.  The hang looks to me like a Jenkins
>>>>>> bug because build hangs before we even checkout src.  Am trying to see
>>>>>> what can be done to get it going again but that's the story at the mo.
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
>



-- 
Todd Lipcon
Software Engineer, Cloudera

