hbase-issues mailing list archives

From "Demai Ni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline
Date Fri, 06 Dec 2013 19:41:36 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841595#comment-13841595
] 

Demai Ni commented on HBASE-9047:
---------------------------------

I went through the two -1s and found nothing that relates them to this patch.

For the -1 on javadoc ("The javadoc tool appears to have generated 1 warning messages"): besides
the existing (and known) warning in Bytes.java about sun.misc.Unsafe, there is one more:
{code}
src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: warning - @return tag has no arguments
{code}
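
For reference, javadoc emits this warning when a method comment contains a bare @return with no description after it. The snippet below is only a sketch of the usual fix, not the code in LabelExpander.java: the class LabelJoiner and its join method are hypothetical names made up for illustration.

```java
// Hypothetical example class; it is NOT the real LabelExpander code.
final class LabelJoiner {

    /**
     * Joins the given labels into one comma-separated string.
     *
     * @param labels the labels to join; must not be null
     * @return the labels joined by ", ", or an empty string if none are given
     */
    static String join(String... labels) {
        // A bare "@return" with no text after it triggers the warning;
        // describing the returned value, as above, avoids it.
        return String.join(", ", labels);
    }

    public static void main(String[] args) {
        System.out.println(join("secret", "topsecret"));
    }
}
```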

For the -1 on findbugs ("The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings"):
I didn't find anything associated with this patch either, but one warning is related to ReplicationSource.java:
{code}
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.processEndOfFile() might ignore java.io.IOException

Bug type DE_MIGHT_IGNORE
In class org.apache.hadoop.hbase.replication.regionserver.ReplicationSource
In method org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.processEndOfFile()
Exception class java.io.IOException
At ReplicationSource.java:[line 736]

        try {
          FileStatus stat = this.fs.getFileStatus(this.currentPath);
          filesize = stat.getLen()+"";
        } catch (IOException ex) {} // <--- error
        LOG.trace("Reached the end of a log, stats: " + getStats() +
            ", and the length of the file is " + filesize);
{code}
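
One common way to address a DE_MIGHT_IGNORE warning like the one above is to log the exception and fall back to a placeholder value instead of leaving the catch block empty. The following is a minimal, self-contained sketch of that pattern only. It uses java.nio.file and java.util.logging in place of the Hadoop FileSystem and HBase logger so that it runs on its own, and the class name EndOfFileStats, the method describeLength, and the "unknown" fallback are all assumptions made for illustration, not what ReplicationSource does or should do.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the stat lookup in processEndOfFile().
final class EndOfFileStats {
    private static final Logger LOG = Logger.getLogger(EndOfFileStats.class.getName());

    /** Returns the file length as text, or "unknown" if the file cannot be read. */
    static String describeLength(Path path) {
        try {
            return Long.toString(Files.size(path));
        } catch (IOException ex) {
            // Record the failure instead of silently dropping it; the length is
            // only used in a trace message, so falling back is acceptable here.
            LOG.log(Level.FINE, "Could not get the length of " + path, ex);
            return "unknown";
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("wal", ".log");
        Files.write(tmp, new byte[] {1, 2, 3});
        System.out.println("length: " + describeLength(tmp));
        Files.delete(tmp);
        // The file is gone now, so the lookup fails and the fallback is used.
        System.out.println("length: " + describeLength(tmp));
    }
}
```

Whether the real fix should log at trace or debug level, or handle the exception some other way, is a decision for the separate JIRA.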

I will search first, and open separate JIRAs for the two issues above if none have been
opened yet.

Demai

> Tool to handle finishing replication when the cluster is offline
> ----------------------------------------------------------------
>
>                 Key: HBASE-9047
>                 URL: https://issues.apache.org/jira/browse/HBASE-9047
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.96.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Demai Ni
>             Fix For: 0.98.0
>
>         Attachments: HBASE-9047-0.94.9-v0.PATCH, HBASE-9047-trunk-v0.patch, HBASE-9047-trunk-v1.patch,
HBASE-9047-trunk-v2.patch, HBASE-9047-trunk-v3.patch, HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v4.patch,
HBASE-9047-trunk-v5.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a cluster
that was shut down in an offline fashion. The motivation could be that you don't want to bring
HBase back up but still need that data on the slave.
> So I have this idea of a tool that would be running on the master cluster while it is
down, although it could also run at any time. Basically it would be able to read the replication
state of each master region server, finish replicating what's missing to all the slave, and
then clear that state in zookeeper.
> The code that handles replication does most of that already, see ReplicationSourceManager
and ReplicationSource. Basically when ReplicationSourceManager.init() is called, it will check
all the queues in ZK and try to grab those that aren't attached to a region server. If the
whole cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your machines and the
load will be spread out, but that might not be a big concern if replication wasn't lagging
since it would take a few seconds to finish replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake region server
ID, and you'd tell it not to start its own source.
> FWIW the main difference in how replication is handled between Apache's HBase and Facebook's
is that the latter is always done separately of HBase itself. This jira isn't about doing
that.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
