hbase-issues mailing list archives

From "Lars George (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3890) Scheduled tasks in distributed log splitting not in sync with ZK
Date Mon, 16 May 2011 20:38:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034291#comment-13034291 ]

Lars George commented on HBASE-3890:
------------------------------------

Hi Prakash, thanks for the input! I think this is unrelated, though, since it happened after
the patch and restart. The messages should all have come from the replay of the recovered logs.
They do not show up because the first few cause them to drop from the TaskMonitor due to
the reuse.

I am not sure where or how this goes wrong, or whether it does at all. But I did get those
"leaked" log hints, and the UI did not show any running tasks as it should have. So something
is amiss, but I still need to check what is wrong.

> Scheduled tasks in distributed log splitting not in sync with ZK
> ----------------------------------------------------------------
>
>                 Key: HBASE-3890
>                 URL: https://issues.apache.org/jira/browse/HBASE-3890
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Lars George
>             Fix For: 0.92.0
>
>
> This is in continuation to HBASE-3889:
> Note that something more must be slightly off here. Although the splitlogs znode is now empty,
> the master is still stuck here:
> {noformat}
> Doing distributed log split in hdfs://localhost:8020/hbase/.logs/10.0.0.65,60020,1305406356765
> - Waiting for distributed tasks to finish. scheduled=2 done=1 error=0   4380s
> Master startup	
> - Splitting logs after master startup   4388s
> {noformat}
> There seems to be a mismatch between what is in ZK and what the TaskBatch holds. In my case
> it could be related to the fact that the task was already in ZK after many faulty restarts
> caused by the NPE. Maybe the task was added once (since it is keyed by path, and the path is
> unique on my machine), but the reference count was incremented twice? Now that the real task
> is done, the done counter has been increased, but it will never match the scheduled counter.
> The code could also check whether ZK is actually depleted, and if so treat the remaining
> scheduled task as bogus? This of course only treats the symptom, not the root cause of this
> condition.
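
The hypothesis in the description can be sketched as follows. This is only a minimal
illustration under stated assumptions, not the actual HBase code: the class
SplitBatchSketch, its Batch type, and all method names are hypothetical, and an in-memory
map stands in for the children of the splitlogs znode. It shows how counting a task twice
while registering it once by path could produce scheduled=2 done=1, and what a
symptom-level "ZK is depleted" check might look like.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the suspected mismatch; none of these names come from HBase.
class SplitBatchSketch {

    // Stand-in for the per-batch counters printed in the master's status line.
    static final class Batch {
        int scheduled;
        int done;
    }

    // Stand-in for the outstanding tasks, keyed by log path (assumed to mirror
    // the children of the splitlogs znode).
    private final Map<String, Batch> tasksByPath = new HashMap<>();

    // Suspected faulty behavior: the counter is bumped even when the path
    // was already registered, so one task is counted twice.
    void enqueueUnguarded(String path, Batch batch) {
        tasksByPath.putIfAbsent(path, batch);
        batch.scheduled++;
    }

    // Guarded variant: the counter is bumped only for a newly registered path.
    void enqueueGuarded(String path, Batch batch) {
        if (tasksByPath.putIfAbsent(path, batch) == null) {
            batch.scheduled++;
        }
    }

    // Completing a task removes its path and credits the batch exactly once.
    void complete(String path) {
        Batch batch = tasksByPath.remove(path);
        if (batch != null) {
            batch.done++;
        }
    }

    // Symptom-level check: nothing is outstanding, yet the counters disagree.
    boolean isStuckOnBogusTask(Batch batch) {
        return tasksByPath.isEmpty() && batch.done < batch.scheduled;
    }

    public static void main(String[] args) {
        SplitBatchSketch manager = new SplitBatchSketch();
        Batch batch = new Batch();
        String path = "example-log-path"; // placeholder, not a real log path

        manager.enqueueUnguarded(path, batch); // first (possibly stale) registration
        manager.enqueueUnguarded(path, batch); // same path again, counted a second time
        manager.complete(path);

        System.out.println("scheduled=" + batch.scheduled + " done=" + batch.done
                + " stuck=" + manager.isStuckOnBogusTask(batch));
    }
}
```

With the guarded variant the same sequence ends with both counters at 1. As the
description notes, the isStuckOnBogusTask check would only paper over the symptom; the
guard on registration is closer to addressing a double count, if that is indeed the cause.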


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
