hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17633) Update unflushed sequence id in SequenceIdAccounting after flush with the minimum sequence id in memstore
Date Fri, 31 Mar 2017 01:31:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950155#comment-15950155 ]

Duo Zhang commented on HBASE-17633:
-----------------------------------

I finally found that if we keep the minimum sequence id in the memstore, then we do not need SequenceIdAccounting
anymore. (Of course highestSequenceIds is still needed, but it is not worth having a separate
class for it.) The WAL could ask the memstore whether it can safely purge an old WAL file.
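
A minimal sketch of that purge check, under stated assumptions: MinSeqIdSource and
WalPurgeCheck are hypothetical names for illustration, not HBase classes, and regions
are keyed by a plain string. The idea is that an old WAL file is safe to purge only if,
for every region with edits in it, the smallest sequence id still in the memstore is
greater than the highest sequence id that region wrote to the file.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical view of a memstore that knows the smallest sequence id it still holds. */
interface MinSeqIdSource {
    /** Empty when the memstore holds no unflushed edits. */
    OptionalLong minUnflushedSequenceId();
}

/** Hypothetical helper that decides whether an old WAL file can be purged. */
final class WalPurgeCheck {
    // region name -> memstore view for that region
    private final Map<String, MinSeqIdSource> memstores = new ConcurrentHashMap<>();

    void register(String region, MinSeqIdSource source) {
        memstores.put(region, source);
    }

    /**
     * @param highestSeqIdsInFile per-region highest sequence id written to the WAL file
     * @return true if every edit in the file has already been flushed
     */
    boolean canPurge(Map<String, Long> highestSeqIdsInFile) {
        for (Map.Entry<String, Long> e : highestSeqIdsInFile.entrySet()) {
            MinSeqIdSource source = memstores.get(e.getKey());
            if (source == null) {
                // Assumption: an unregistered region has nothing unflushed.
                continue;
            }
            OptionalLong min = source.minUnflushedSequenceId();
            // An unflushed edit at or below the file's highest id may live in this file.
            if (min.isPresent() && min.getAsLong() <= e.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

The per-file highest sequence ids are still an input here, which is why highestSequenceIds
has to stay around in some form.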

Will give it a try soon.

> Update unflushed sequence id in SequenceIdAccounting after flush with the minimum sequence id in memstore
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17633
>                 URL: https://issues.apache.org/jira/browse/HBASE-17633
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>    Affects Versions: 2.0.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17633.patch, HBASE-17633-v1.patch
>
>
> Currently the tracking work is done by SequenceIdAccounting, and it is a little tricky when
> dealing with flushes. We have to remove the mapping for the given stores of a region from lowestUnflushedSequenceIds,
> so that there is room to store the new lowest unflushed sequence id after the flush. But we still
> need to keep the old sequence ids in another map, because we use these values when
> reporting to the master to prevent data loss (think of the scenario where we report the new lowest
> unflushed sequence id to the master and then crash before the data is actually flushed to disk).
> And when reviewing HBASE-17407, I found that for CompactingMemStore we have to record
> the minimum sequence id in the memstore. We could just update the mappings in SequenceIdAccounting
> using these values after a flush. This means we do not need to update the lowest unflushed sequence
> id in SequenceIdAccounting, do not need to make room for the new lowest unflushed sequence id
> in startCacheFlush, and do not need the extra map to store the old mappings.
> This could simplify our logic a lot. But this is a fundamental change, so I need some time
> to implement it, especially for modifying the tests... I also need some time to check whether
> I have missed something.
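
To make the contrast in the description concrete, here is a simplified sketch of both
bookkeeping schemes. It is an illustration under assumptions, not the HBase code: the class
and method names (TwoMapAccounting, SingleMapAccounting, startFlush, onFlushCompleted, and
so on) and the NO_SEQ_ID sentinel are hypothetical, and it tracks one value per region
where the description talks about the stores of a region.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Sketch of the current scheme: a second map keeps the old ids while a flush runs. */
final class TwoMapAccounting {
    static final long NO_SEQ_ID = -1L; // hypothetical "nothing unflushed" sentinel
    private final Map<String, Long> lowestUnflushed = new HashMap<>();
    private final Map<String, Long> flushing = new HashMap<>();

    /** Only the first append after a flush starts sets the lowest unflushed id. */
    synchronized void onAppend(String region, long seqId) {
        lowestUnflushed.putIfAbsent(region, seqId);
    }

    /** Make room for the new lowest id, but remember the old one. */
    synchronized void startFlush(String region) {
        Long old = lowestUnflushed.remove(region);
        if (old != null) {
            flushing.put(region, old);
        }
    }

    /** The data is on disk, so the old id can be forgotten. */
    synchronized void completeFlush(String region) {
        flushing.remove(region);
    }

    /** The flush failed: the old (smaller) id becomes the lowest unflushed id again. */
    synchronized void abortFlush(String region) {
        Long old = flushing.remove(region);
        if (old != null) {
            lowestUnflushed.put(region, old);
        }
    }

    /** Value to report to the master; the old id wins while a flush is in progress. */
    synchronized long lowestToReport(String region) {
        Long inFlight = flushing.get(region);
        if (inFlight != null) {
            return inFlight;
        }
        Long lowest = lowestUnflushed.get(region);
        return lowest != null ? lowest : NO_SEQ_ID;
    }
}

/** Sketch of the proposed scheme: one map, updated from the memstore after the flush. */
final class SingleMapAccounting {
    private final Map<String, Long> lowestUnflushed = new HashMap<>();

    synchronized void onAppend(String region, long seqId) {
        lowestUnflushed.putIfAbsent(region, seqId);
    }

    /** Called only once the flushed data is on disk. */
    synchronized void onFlushCompleted(String region, OptionalLong minSeqIdInMemstore) {
        if (minSeqIdInMemstore.isPresent()) {
            lowestUnflushed.put(region, minSeqIdInMemstore.getAsLong());
        } else {
            lowestUnflushed.remove(region);
        }
    }

    synchronized long lowestToReport(String region) {
        Long lowest = lowestUnflushed.get(region);
        return lowest != null ? lowest : TwoMapAccounting.NO_SEQ_ID;
    }
}
```

In the second class the map keeps the old value for the whole duration of the flush, so
a report to the master in that window still carries the old id, and there is nothing to
restore if the flush fails.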



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
