bookkeeper-user mailing list archives

From Sijie Guo <si...@apache.org>
Subject Re: Ledgers failing to replicate
Date Wed, 11 Jan 2017 20:04:50 GMT
On Wed, Jan 11, 2017 at 11:15 AM, Sebastián Schepens <
sebastian.schepens@mercadolibre.com> wrote:

> Hi guys,
> I'm doing some tests and turned off 2 bookies almost simultaneously hoping
> that all the ledgers would still be able to replicate since we have
> ensemble and quorum size of 3.
> Almost all ledgers managed to replicate using the autorecovery daemon
> except for 5. What's curious about these 5 ledgers is that they are all
> empty, and the only node which contains data for them claims they do not
> exist.
>
> Here's the ledger metadata for one of them:
> ledgerID: 772
> BookieMetadataFormatVersion 2
> quorumSize: 3
> ensembleSize: 3
> length: 0
> lastEntryId: -1
> state: IN_RECOVERY
> segment {
>   ensembleMember: "10.64.103.57:3181"
>   ensembleMember: "10.64.103.249:3181"
>   ensembleMember: "10.64.102.95:3181"
>   firstEntryId: 0
> }
> digestType: CRC32
> password: ""
> ackQuorumSize: 2
>
> Where all nodes except 10.64.103.249 are down.
>
> And that node contains these logs:
> ERROR - [BookieReadThread-3181-10-1:ReadEntryProcessorV3@123] - No ledger
> found while reading entry:-1 from ledger: 772
>

They seem to be empty ledgers with no entries.


>
> I don't understand how these ledgers ended up in this state. Is it
> recoverable?
>

If the ledgers are closed, re-replication can still replicate the data
correctly even after you lose two bookies. A closed ledger records its last
entry id in the metadata, and re-replication uses that information to
determine the state of the ledger and copy the data correctly.
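
As a rough illustration of the closed-ledger case (a simplified model only,
not BookKeeper's actual replication code; the function name and the dict
layout are made up for this sketch), the metadata alone is enough to know
which entries must exist:

```python
def entries_to_replicate(metadata):
    """Return the entry ids a ledger is known to contain, or None if unknown.

    `metadata` is a hypothetical dict mirroring the fields printed in the
    ledger metadata dump ("state", "lastEntryId"). Entry ids start at 0.
    """
    if metadata["state"] != "CLOSED":
        # Open or IN_RECOVERY: the last entry id has not been agreed on yet,
        # so the metadata cannot say how many entries there are.
        return None
    # A closed ledger with lastEntryId == -1 yields an empty range.
    return range(0, metadata["lastEntryId"] + 1)


closed = {"state": "CLOSED", "lastEntryId": 2}
in_recovery = {"state": "IN_RECOVERY", "lastEntryId": -1}  # like ledger 772

print(entries_to_replicate(closed))
print(entries_to_replicate(in_recovery))
```

Under this model, ledger 772 as posted falls into the second case: its
state is IN_RECOVERY, so its lastEntryId of -1 is not yet authoritative.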

However, if the ledgers are open and you lose two bookies (which is the
majority of your quorum), the client cannot decide what the last entry id
is based on the single remaining bookie, so it cannot close/seal the ledger
correctly.
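
A minimal sketch of that reasoning, assuming a simplified coverage rule
(the function and parameter names are hypothetical and this is not the
client's real recovery logic): an entry may have been acknowledged by any
ackQuorumSize bookies, so the client can only trust what it reads if fewer
than ackQuorumSize bookies are silent.

```python
def can_seal(write_quorum, ack_quorum, bookies_responding):
    """Simplified check: can the last entry id be decided safely?

    If `ack_quorum` or more bookies are unreachable, an acknowledged entry
    could live only on the unreachable ones, so the answer is no.
    """
    silent = write_quorum - bookies_responding
    return silent < ack_quorum


# Values from the posted metadata: quorumSize 3, ackQuorumSize 2.
print(can_seal(3, 2, bookies_responding=3))  # all bookies up
print(can_seal(3, 2, bookies_responding=2))  # one bookie down
print(can_seal(3, 2, bookies_responding=1))  # two bookies down, as in the test
```

With one bookie down the check still passes, but with two down it fails,
which matches the situation described above.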

Can you explain more about your tests? It would help me understand more
about that.


>
> I could just delete the ledgers, since they are empty anyway. By the way,
> the bookkeeper shell should have a command for deleting ledgers.
>

Yeah, this is a good suggestion. Do you mind creating a JIRA for adding the
delete ledger command?


>
> Thanks,
> Sebastian
>
