hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1924) [hbase] TestDFSAbort failed in nightly #242
Date Fri, 05 Oct 2007 17:54:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532737 ]

Doug Cutting commented on HADOOP-1924:
--------------------------------------

It seems that setSoTimeout only affects reads, not writes.  So you're right, there is no way
in Java to set a write timeout!
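
To illustrate the read side, here is a minimal sketch (not Hadoop code; the class and method names are made up for illustration).  A local server accepts a connection but never writes, so the client's read blocks until SO_TIMEOUT fires.  java.net.Socket has no matching setter for writes, so a write() to a peer that has stopped reading can block once the send buffer fills.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {

    /**
     * Connects to a local server that never sends data and reports whether
     * the blocking read was cut short by the SO_TIMEOUT set on the socket.
     */
    static boolean readTimesOut(int timeoutMillis) {
        // Port 0 asks the OS for any free port; the server side stays silent.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setSoTimeout(timeoutMillis); // applies to reads on this socket only
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks: the peer never writes
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // the read gave up after roughly timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```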

One theory is that the server, while stopped, may not in fact be closing all its connections.
 I couldn't see where that was done just now when I looked.  Handlers are interrupted, but
they don't close their connection on interrupt.  The listener thread calls cleanupConnections(true),
but only on OutOfMemoryError, not in a 'finally' clause.  And, even then cleanupConnections(true)
doesn't look like it closes connections that have been recently active.
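
A rough sketch of what closing connections in a 'finally' clause could look like.  This is not the actual server code: Connection, register() and closeAllConnections() are hypothetical stand-ins, and the sketch ignores thread safety.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

class ListenerShutdownSketch {

    /** Hypothetical stand-in for a server-side connection. */
    static class Connection implements Closeable {
        private boolean open = true;

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    private final List<Connection> connections = new ArrayList<>();

    /** Tracks a new connection so it can be closed at shutdown. */
    Connection register() {
        Connection c = new Connection();
        connections.add(c);
        return c;
    }

    /** Closes every tracked connection, however recently it was active. */
    void closeAllConnections() {
        for (Connection c : connections) {
            c.close();
        }
        connections.clear();
    }

    /** Runs the listener body; cleanup happens on every exit path, not only on errors. */
    void run(Runnable body) {
        try {
            body.run();
        } finally {
            closeAllConnections();
        }
    }
}
```

With this shape, a normal stop and an unexpected exception both leave no connection open.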

So please check the server's logs to see if for each "Server connection from" line there is
a corresponding "disconnecting client" line.  If there's not, then this could be the problem.
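
A quick way to do a first pass over a log file, as a sketch.  It only counts lines containing the two phrases quoted above and does not pair them up per client, since the exact log format is not assumed here; the class and method names are made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class ConnectionLogCheck {

    /**
     * Returns the number of connect lines minus the number of disconnect lines.
     * A positive result suggests some connections were never closed.
     */
    static long unmatchedConnections(List<String> logLines) {
        long opened = logLines.stream()
                .filter(line -> line.contains("Server connection from"))
                .count();
        long closed = logLines.stream()
                .filter(line -> line.contains("disconnecting client"))
                .count();
        return opened - closed;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the server log to inspect.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("unmatched connections: " + unmatchedConnections(lines));
    }
}
```

If the count is non-zero, matching individual client addresses between the two kinds of lines would be the next step.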

Some potentially relevant discussions:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4283017
http://forum.java.sun.com/thread.jspa?threadID=5203832&tstart=75
http://www-1.ibm.com/support/docview.wss?rs=180&uid=swg1PK37506
http://archives.java.sun.com/cgi-bin/wa?A2=ind0212&L=rmi-users&P=731

Other possible things to try:
- call Socket#setKeepAlive(true) on IPC sockets
- try calling Thread#interrupt(), it may help...
- adjust some of the TCP parameters on lucene.zones.apache.org
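
For the keepalive suggestion, a minimal sketch of the call (the helper class is hypothetical, not part of the IPC code).  Note that SO_KEEPALIVE only asks the OS to probe idle connections; how soon a dead peer is noticed depends on the host's TCP settings, and the Linux default idle time before the first probe is two hours, so this ties in with adjusting the TCP parameters.

```java
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

class KeepAliveSketch {

    /** Turns on SO_KEEPALIVE and reports whether the socket now has it enabled. */
    static boolean enableKeepAlive(Socket socket) {
        try {
            socket.setKeepAlive(true);
            return socket.getKeepAlive();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```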


> [hbase] TestDFSAbort failed in nightly #242
> -------------------------------------------
>
>                 Key: HADOOP-1924
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1924
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>            Priority: Minor
>         Attachments: testdfsabort.patch, testdfsabort_patchbuild798.txt
>
>
> TestDFSAbort and TestBloomFilters failed in last night's nightly build (#242).  This issue
> is about trying to figure out what's up w/ TDFSA.
> Studying console logs, HRegionServer stopped logging any activity and HMaster for its
> part did not expire the HRegionServer lease.  On top of it all, continued tests of the state
> of HDFS -- the test is meant to ensure HBase shuts down when HDFS is pulled from under it -- seem
> to have kept reporting it as healthy even though it had been closed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

