bookkeeper-user mailing list archives

From Whitney Sorenson <wsoren...@hubspot.com>
Subject Re: Using BK as WAL and accessing ledger metadata
Date Mon, 04 Feb 2013 21:11:29 GMT
Thanks Flavio,

We have been considering reading the BK state out of ZK ourselves. I could
see how this data might be available in a roundabout (not advised) way
from the BK client. I don't think we would need to manipulate it,
because after we have processed the ledgers we delete them. The only
additional state I believe we would need is a lock around a ledger
while it is being processed (moved out of BK).
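
To make the per-ledger lock concrete, here is a minimal sketch in Java. It is an in-memory stand-in only: the class name LedgerClaims, the reader ids and the ledger id are all hypothetical, and in a real deployment the claim would presumably live in ZooKeeper (for example as an ephemeral znode) rather than in a local map. Nothing here is part of the BookKeeper client API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for per-ledger locks. Assumption: a real
// deployment would keep each claim in ZooKeeper (e.g. an ephemeral znode) so
// that a crashed reader loses its claim; this class only models the logic.
final class LedgerClaims {
    // ledgerId -> id of the reader currently moving that ledger out of BK
    private final Map<Long, String> owners = new ConcurrentHashMap<>();

    // Atomically claim a ledger; returns false if any reader already holds it.
    boolean tryClaim(long ledgerId, String readerId) {
        return owners.putIfAbsent(ledgerId, readerId) == null;
    }

    // Release a claim, but only if this reader is the current owner.
    boolean release(long ledgerId, String readerId) {
        return owners.remove(ledgerId, readerId);
    }

    // Current owner of the ledger, if it is claimed.
    Optional<String> owner(long ledgerId) {
        return Optional.ofNullable(owners.get(ledgerId));
    }

    public static void main(String[] args) {
        LedgerClaims claims = new LedgerClaims();
        long ledgerId = 42L; // hypothetical ledger id

        System.out.println(claims.tryClaim(ledgerId, "reader-a")); // prints "true"
        System.out.println(claims.tryClaim(ledgerId, "reader-b")); // prints "false"

        // reader-a would copy the ledger to long-term storage and delete it
        // from BK here, then give up its claim.
        System.out.println(claims.release(ledgerId, "reader-a"));  // prints "true"
        System.out.println(claims.owner(ledgerId).isPresent());    // prints "false"
    }
}
```

Because processed ledgers are deleted, a claim only has to live for as long as a ledger is being moved, so no separate "already processed" list is modelled here.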

On Mon, Feb 4, 2013 at 4:08 PM, Flavio Junqueira <fpjunqueira@yahoo.com> wrote:

> Hi Whitney,
>
> In general we leave it up to the application to organize the ledgers it
> creates. Bookkeeper is indifferent to which ledgers have been created by
> a single writer and how the contents of ledgers relate. Managing this
> kind of application state is something that zookeeper does well and since
> we assume a zookeeper deployment, the application can use it to manage its
> metadata. Although we typically don't recommend that applications access
> the zookeeper metadata for bookkeeper ledgers, there is nothing really that
> prevents you from doing it. If it is useful for you to read this metadata,
> I don't see a big problem with doing it, although I'd like to stress that I
> find it a bad idea to try to manipulate the zookeeper state for bookkeeper
> ledgers directly.
>
> On your point about duplication, don't you need to remember which closed
> ledgers have already been processed? Just knowing the list of closed
> ledgers might not be sufficient. If this is the case, then you need to keep
> some additional metadata on the side.
>
> -Flavio
>
> On Feb 4, 2013, at 7:44 PM, Whitney Sorenson <wsorenson@hubspot.com>
> wrote:
>
> Thank you for responding.
>
> Forgive me if I'm missing something, but if I have a writer and separate
> readers, why would I want to have to communicate ledger ids between them?
> More specifically, we have a series of writers writing to a write-ahead log
> and a separate set of readers that are consuming these ledgers to move them
> into long term storage and send them to queues / workflows to be processed.
> This means I have to keep the state about which ledgers are available, and
> which are closed, which seems to be a complete duplication of the state
> that is already in BK.
>
> I'm not sure named ledgers are helpful in this situation, except that we
> could keep less state (perhaps a sequential id).
>
> On Mon, Feb 4, 2013 at 1:27 PM, Sijie Guo <guosijie@gmail.com> wrote:
>
>>
>> Hello, Whitney:
>>
>> please check the replies inline.
>>
>> On Mon, Feb 4, 2013 at 8:47 AM, Whitney Sorenson <wsorenson@hubspot.com> wrote:
>>
>>> Hey all,
>>>
>>> A couple questions about running BK stand-alone:
>>>
>>> 1) If I call openLedgerNoRecovery am I blocking writes or not? What are
>>> the guarantees I lose - just ordering? Can I use this to essentially read /
>>> tail an active ledger?
>>>
>>
>> Opening a ledger using openLedgerNoRecovery doesn't block any writes to it,
>> and you don't lose the ordering guarantee. You could use it to read/tail an
>> active ledger, but please keep in mind that you need to call
>> #readLastConfirmed to catch up to the latest confirmed entries added by the
>> writer. The entries you can read from a ledger opened with
>> openLedgerNoRecovery are only those between 0 and last confirmed.
>>
>> You could check:
>> http://zookeeper.apache.org/bookkeeper/docs/r4.2.0/apidocs/org/apache/bookkeeper/client/BookKeeper.html#asyncOpenLedgerNoRecovery(long,
>> org.apache.bookkeeper.client.BookKeeper.DigestType, byte[],
>> org.apache.bookkeeper.client.AsyncCallback.OpenCallback, java.lang.Object)
>>
>>
>>>
>>> 2) How can I access BK's metadata so that I can determine a list of
>>> ledgers, and which ledgers are closed/open? It doesn't appear in the client
>>> documentation (
>>> http://zookeeper.apache.org/bookkeeper/docs/r4.2.0/apidocs/org/apache/bookkeeper/client/)
>>> Is this not an intended operation? Are clients supposed to track ledger ids
>>> on their own (we are currently doing this but it seems suboptimal)?
>>>
>>>
>> Currently we don't expose this API to clients. Is there any special case
>> you are considering? We'd be happy to expose it if necessary.
>>
>> Most use cases work in the following style: a *standby*
>> writer observes the *active* writer's state; if the *active* writer fails,
>> the *standby* writer takes over the responsibility, closes the ledger
>> written by the *active* writer, replays the ledger and creates a new ledger
>> to write new entries. For now, clients need to track ledger ids on their end.
>>
>> There is one proposal to provide *named* ledgers on top of
>> bookkeeper to ease the user's experience of tracking ledger ids. You could check:
>> https://issues.apache.org/jira/browse/BOOKKEEPER-220 . We are also
>> discussing whether to provide ledger names internally in bookkeeper for
>> metadata access concerns. We'd like to hear your feedback on the usage of
>> the API to make it better.
>>
>>
>>
>>> Thank you;
>>>
>>> -Whitney Sorenson
>>> HubSpot
>>>
>>>
>>
>
>
