hbase-dev mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-2070) Collect HLogs and delete them after a period of time
Date Wed, 23 Dec 2009 21:42:29 GMT
Collect HLogs and delete them after a period of time

                 Key: HBASE-2070
                 URL: https://issues.apache.org/jira/browse/HBASE-2070
             Project: Hadoop HBase
          Issue Type: New Feature
            Reporter: Jean-Daniel Cryans
            Assignee: Jean-Daniel Cryans
             Fix For: 0.21.0

For replication we need to be able to service clusters that are a few hours behind in edits.
For example, after distcp'ing a snapshot of the DB to another cluster, we need to make sure
we get the edits that came in after the snapshot was taken.

I plan the following changes:
- Instead of deleting HLogs during a log roll or after a log split, move them to another folder
where all logs should be aggregated.
- Add a new configuration setting for the maximum age of a log. For a normal cluster I am
thinking of a default of 2 hours. For replication you may want to set it much higher.
- Create a new thread in the master that checks for logs older than the configured time and
deletes them.
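
As a rough illustration of the three changes above, here is a minimal sketch in plain Java. It is not HBase code: the class name OldLogCleaner, its methods, and the use of local files through java.nio.file in place of HDFS are all assumptions made for the example. It also assumes a log's age can be read from its last-modified time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch only; names and the local-filesystem setup are assumptions. */
class OldLogCleaner {
    private final Path archiveDir;
    // volatile so a new value set at runtime is seen by the cleaner thread
    private volatile long maxAgeMillis;

    OldLogCleaner(Path archiveDir, long maxAgeMillis) {
        this.archiveDir = archiveDir;
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Change the maximum log age while the cleaner is running. */
    void setMaxAgeMillis(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Instead of deleting a rolled or split log, move it into the archive folder. */
    Path archive(Path log) {
        try {
            Files.createDirectories(archiveDir);
            return Files.move(log, archiveDir.resolve(log.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One cleaning pass: delete archived logs older than the configured age. */
    int cleanOnce(long nowMillis) {
        if (!Files.isDirectory(archiveDir)) {
            return 0; // nothing has been archived yet
        }
        int deleted = 0;
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(archiveDir)) {
            for (Path log : logs) {
                long age = nowMillis - Files.getLastModifiedTime(log).toMillis();
                if (age > maxAgeMillis) {
                    Files.delete(log);
                    deleted++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }

    /** Run cleanOnce periodically on a background thread; caller shuts it down. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(() -> cleanOnce(System.currentTimeMillis()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

With a 2 hour default this would be constructed as new OldLogCleaner(dir, TimeUnit.HOURS.toMillis(2)). A replication setup would pass a larger value, or call setMaxAgeMillis later.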

I would also like the deletion time to be configurable while the cluster is running. I'm
also thinking of adding a way to tell the cluster to replay edits on itself.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.