hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1000) Sleeper.sleep does not go back to sleep when interrupted and no stop flag given.
Date Wed, 03 Dec 2008 23:11:44 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1000:
-------------------------

    Fix Version/s: 0.19.0

Adding to 0.19.0.  What I saw was as follows:

{code}
2008-12-03 22:54:53,507 [regionserver/0:0:0:0:0:0:0:0:60020.majorCompactionChecker] ERROR org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: Caught exception
java.nio.channels.ClosedSelectorException
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
        at sun.nio.ch.SelectorImpl.selectNow(Unknown Source)
        at sun.nio.ch.Util.releaseTemporarySelector(Unknown Source)
        at sun.nio.ch.SocketAdaptor.connect(Unknown Source)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
        at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:685)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy1.getListing(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.getListing(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:569)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:226)
        at org.apache.hadoop.hbase.regionserver.HStore.getLowestTimestamp(HStore.java:783)
        at org.apache.hadoop.hbase.regionserver.HStore.isMajorCompaction(HStore.java:975)
        at org.apache.hadoop.hbase.regionserver.HStore.isMajorCompaction(HStore.java:963)
        at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:2424)
        at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:735)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:65)
2008-12-03 22:54:53,509 [regionserver/0:0:0:0:0:0:0:0:60020.majorCompactionChecker] ERROR org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: Caught exception
java.lang.NullPointerException
        at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
        at org.apache.hadoop.ipc.Client.call(Client.java:686)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy1.getListing(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.getListing(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:569)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:226)
        at org.apache.hadoop.hbase.regionserver.HStore.getLowestTimestamp(HStore.java:783)
        at org.apache.hadoop.hbase.regionserver.HStore.isMajorCompaction(HStore.java:975)
        at org.apache.hadoop.hbase.regionserver.HStore.isMajorCompaction(HStore.java:963)
        at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:2424)
        at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:735)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:65)
...
{code}

The chore kept throwing the exceptions above over and over, with no pause between attempts.

The root issue was an OOME, but the chore should nonetheless pause between runs.
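
For illustration only, here is a minimal sketch of a chore-style loop that still waits out its period when the work throws. PausingChore, its fields and its constructor are hypothetical names made up for this example; this is not the actual org.apache.hadoop.hbase.Chore code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical periodic task runner; a sketch, not HBase's Chore. */
abstract class PausingChore implements Runnable {
  private final long period;        // milliseconds between runs
  private final AtomicBoolean stop; // shared flag asking the loop to exit

  PausingChore(final long period, final AtomicBoolean stop) {
    this.period = period;
    this.stop = stop;
  }

  /** The periodic work. It may throw; the loop logs and carries on. */
  protected abstract void chore() throws Exception;

  @Override
  public void run() {
    while (!this.stop.get()) {
      final long start = System.currentTimeMillis();
      try {
        chore();
      } catch (Exception e) {
        System.err.println("Caught exception: " + e);
      }
      // Wait out the rest of the period whether or not chore() threw,
      // so a failing chore cannot spin in a tight loop.
      long remaining = this.period - (System.currentTimeMillis() - start);
      while (remaining > 0 && !this.stop.get()) {
        try {
          Thread.sleep(remaining);
        } catch (InterruptedException ie) {
          // Woken early: the loop condition re-checks the stop flag.
        }
        remaining = this.period - (System.currentTimeMillis() - start);
      }
    }
  }
}
```

The pause sits outside the try block on purpose, so that the exception path and the normal path both wait before the next run.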

> Sleeper.sleep does not go back to sleep when interrupted and no stop flag given.
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-1000
>                 URL: https://issues.apache.org/jira/browse/HBASE-1000
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: util
>            Reporter: Nitay Joffe
>            Priority: Trivial
>             Fix For: 0.19.0
>
>         Attachments: hbase-1000.patch
>
>
> When interrupted, the Sleeper.sleep method should exit if the stop flag is set; otherwise it should continue sleeping. Currently it seems to exit regardless, which means the stop flag is meaningless.
> Here is the relevant code, from src/java/org/apache/hadoop/hbase/util/Sleeper.java:
> {code}
>   public void sleep(final long startTime) {
>     if (this.stop.get()) {
>       return;
>     }
>     long now = System.currentTimeMillis();
>     long waitTime = this.period - (now - startTime);
>     if (waitTime > this.period) {
>       LOG.warn("Calculated wait time > " + this.period +
>         "; setting to this.period: " + System.currentTimeMillis() + ", " +
>         startTime);
>     }
>     if (waitTime > 0) {
>       try {
>         Thread.sleep(waitTime);
>         long slept = System.currentTimeMillis() - now;
>         if (slept > (10 * this.period)) {
>           LOG.warn("We slept " + slept + "ms, ten times longer than scheduled: " +
>             this.period);
>         }
>       } catch(InterruptedException iex) {
>         // We we interrupted because we're meant to stop?  If not, just
>         // continue ignoring the interruption
>         if (this.stop.get()) {
>           return;
>         }
>       }
>     }
>   }
> {code}
> Essentially, the 'if (waitTime > 0)' portion needs to change to a while loop so that the sleeping will continue after an interruption occurs. I'll attach a patch when I get around to fixing it.
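
A rough sketch of what that while-loop change could look like follows. This is not the attached hbase-1000.patch; the class name ResumingSleeper and the deadline variable are assumptions made for this example. It also caps the wait at one period, which the quoted code only logs about.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical variant of Sleeper; a sketch, not the committed fix. */
class ResumingSleeper {
  private final long period;        // target interval in milliseconds
  private final AtomicBoolean stop; // shared stop flag

  ResumingSleeper(final long period, final AtomicBoolean stop) {
    this.period = period;
    this.stop = stop;
  }

  /** Sleep until startTime + period; return early only if stop is set. */
  void sleep(final long startTime) {
    if (this.stop.get()) {
      return;
    }
    final long now = System.currentTimeMillis();
    // Never wait longer than one period, even if startTime is in the future.
    final long waitTime = Math.min(this.period - (now - startTime), this.period);
    final long deadline = now + waitTime;
    long remaining = waitTime;
    while (remaining > 0) {
      try {
        Thread.sleep(remaining);
      } catch (InterruptedException iex) {
        // Interrupted because we are meant to stop? If not, fall through
        // and go back to sleep for whatever time is left.
        if (this.stop.get()) {
          return;
        }
      }
      remaining = deadline - System.currentTimeMillis();
    }
  }
}
```

Recomputing the remaining time against a fixed deadline keeps repeated interrupts from stretching the total sleep beyond the original wait.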

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

