hbase-dev mailing list archives

From "Duo Zhang (Jira)" <j...@apache.org>
Subject [jira] [Reopened] (HBASE-25065) WAL archival to be done by a separate thread
Date Tue, 13 Oct 2020 07:59:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-25065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Duo Zhang reopened HBASE-25065:

This breaks TestMasterRegionOnTwoFileSystems.testFlushAndCompact.

> WAL archival to be done by a separate thread
> --------------------------------------------
>                 Key: HBASE-25065
>                 URL: https://issues.apache.org/jira/browse/HBASE-25065
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>    Affects Versions: 3.0.0-alpha-1, 2.4.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.4.0
> Currently we clean up logs once we have ensured that the region data has been flushed.
> We track the sequence number, and if that sequence number has been flushed for a given
> region and the WAL that was rolled contains it, then that WAL can be archived.
> When we have around 50 files to archive (per RS), we archive them one after the
> other. Since archiving is nothing but a rename operation, it adds to the metadata
> operation load of a cloud-based FS.
> Not only that, the entire archival is done inside the rollWriterLock. Even though we have
> closed the old writer, created a new one, and writes are ongoing, we never release the
> lock until we are done with the archiving.
> As a result, during that period our logs grow beyond the configured default
> size (when we have consistent writes happening).
> So the proposal is to move the log archival to a separate thread and apply
> some kind of throttling or batching so that we don't do the archival in one shot.
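
For illustration only, the proposal quoted above might be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the change that was committed for HBASE-25065: the class name WalArchiverSketch, the batchSize and pauseMillis parameters, and the archiveFn callback (standing in for the file system rename) are all hypothetical. The point it shows is that the roller only enqueues the file while holding its lock, and a single background thread does the renames in small, paced batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of off-lock, batched WAL archival; not HBase code. */
final class WalArchiverSketch implements AutoCloseable {
  private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
  private final int batchSize;               // assumed knob: max files per batch
  private final long pauseMillis;            // assumed knob: pause between batches
  private final Consumer<String> archiveFn;  // stands in for the rename to the archive dir
  private final Thread worker;
  private volatile boolean running = true;

  WalArchiverSketch(int batchSize, long pauseMillis, Consumer<String> archiveFn) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be >= 1");
    }
    this.batchSize = batchSize;
    this.pauseMillis = pauseMillis;
    this.archiveFn = archiveFn;
    this.worker = new Thread(this::run, "wal-archiver-sketch");
    this.worker.setDaemon(true);
    this.worker.start();
  }

  /** Called by the roller, possibly under its lock: only enqueues, so it returns quickly. */
  void submit(String walPath) {
    pending.add(walPath);
  }

  private void run() {
    List<String> batch = new ArrayList<>();
    try {
      // Keep going until close() was called and the queue has been drained.
      while (running || !pending.isEmpty()) {
        String first = pending.poll(100, TimeUnit.MILLISECONDS);
        if (first == null) {
          continue;
        }
        batch.add(first);
        pending.drainTo(batch, batchSize - 1); // take at most batchSize files in total
        for (String wal : batch) {
          try {
            archiveFn.accept(wal);
          } catch (RuntimeException e) {
            // A real implementation would need a retry policy; the sketch just reports it.
            System.err.println("Failed to archive " + wal + ": " + e);
          }
        }
        batch.clear();
        if (pauseMillis > 0) {
          Thread.sleep(pauseMillis); // crude throttling between batches
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Stops accepting the "running" state and waits for queued files to be archived. */
  @Override
  public void close() {
    running = false;
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

In this sketch a single worker preserves the submission order, and close() waits until everything already queued has been handed to archiveFn. How failures, shutdown, and tests that expect archival to be complete at a given point should be handled is left open here.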

This message was sent by Atlassian Jira
