bookkeeper-user mailing list archives

From "Daniel S. Kim" <>
Subject Re: Hedwig Subproject::Hubs owning some topics that doesn't exist.
Date Thu, 22 Dec 2011 16:13:13 GMT
I thought the messages persisted in the bookies even after someone consumes
them. I had one test topic with one publisher and one subscriber. I published
about 5 messages to the topic. I subscribed and consumed the messages from my
listener, which just prints out each message along with its sequence number.
When I get rid of this listener and start another one, the new listener
gets all the previous messages from the topic. How is this possible if
messages are not being piled up somewhere (the bookies)? Does the hub keep all
the messages? I am somewhat confused about how consuming messages gets rid of
old messages. As I understood it, they persist in the bookies. Correct me if I
am wrong.

Also, I would like to contribute by adding a delete method (if that is
possible), topic eviction, etc. However, I feel that I need to study the system
first, and I am not seeing very much information at Is
there any other design documentation with more details? Where is the best
place to learn how Hedwig is built without digging through all of the code?
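
To make the eviction idea concrete, here is a rough sketch of the policy under
discussion: a topic becomes a candidate for removal once it has no subscribers
and has seen no activity for some idle period. This is only an illustration.
The class TopicEvictionSketch and all of its methods are hypothetical and are
not part of Hedwig, and a real implementation would also have to clean up the
topic's state in zookeeper and bookkeeper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Hedwig code base.
class TopicEvictionSketch {
    // Per-topic bookkeeping: subscriber count and time of last activity.
    private static final class TopicState {
        int subscribers;
        long lastActivityMillis;
    }

    private final Map<String, TopicState> topics = new HashMap<>();

    // Record activity (e.g. a publish or a consume) on a topic.
    public void touch(String topic, long nowMillis) {
        topics.computeIfAbsent(topic, t -> new TopicState()).lastActivityMillis = nowMillis;
    }

    // Register one more subscriber; this also counts as activity.
    public void subscribe(String topic, long nowMillis) {
        touch(topic, nowMillis);
        topics.get(topic).subscribers++;
    }

    // Remove one subscriber, if the topic is known and has any.
    public void unsubscribe(String topic, long nowMillis) {
        TopicState state = topics.get(topic);
        if (state != null && state.subscribers > 0) {
            state.subscribers--;
            state.lastActivityMillis = nowMillis;
        }
    }

    // Topics with no subscribers that have been idle for at least idleMillis,
    // returned in sorted order so the result is deterministic.
    public List<String> findEvictable(long nowMillis, long idleMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, TopicState> entry : topics.entrySet()) {
            TopicState state = entry.getValue();
            boolean idle = nowMillis - state.lastActivityMillis >= idleMillis;
            if (state.subscribers == 0 && idle) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    // Forget the topic; a real system would delete its persisted data here.
    public void evict(String topic) {
        topics.remove(topic);
    }
}
```

A periodic task could call findEvictable with the current time and an idle
threshold and then evict each returned topic. Requiring zero subscribers keeps
the policy from discarding messages that someone is still entitled to receive.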


On Thu, Dec 22, 2011 at 9:56 AM, Ivan Kelly <> wrote:

> On Thu, Dec 22, 2011 at 09:25:57AM -0600, Daniel S. Kim wrote:
> > When I say "Delete", I mean that I want all the stuff about that topic to
> > be gone. The reason is that I need topic management to see whether topics
> > are being used or not. If a topic has not been used for a while, I expire
> > it and kill it. This is what I should do to save resources. Imagine a large
> > number of hedwig users who start new topics, send messages, etc. All this
> > data builds up eventually (and I believe there is no eviction mechanism or
> > policy yet). Even though hedwig lets users keep messages persistently, I
> > don't think they should persist when the user wants them gone.
> The only reason data should build up like this is if there is a user
> subscribed to a topic who hasn't consumed all the messages
> published to the topic. Otherwise it should be safe to periodically
> garbage collect topics that have no subscribers, but I don't
> think we do this at the moment. It would be great if you could
> contribute this ;)
> Where exactly are you seeing the problem? Is the zookeeper data
> getting too big, or is the problem in bookkeeper, etc?
> >
> > Since you said it would possibly break some of the guarantees, I would
> > have to look more into it. If my memory is correct, Ben Reed said adding an
> > administrative hedwig function to delete a topic should not be too
> > complicated. If it is indeed complicated to achieve the functionality
> > without breaking the guarantees, I will have to wait or build something
> > around it. I need to know a little bit more about the hedwig hub
> > redistribution and how it works, whether it is configurable, etc. Where
> > should I start (i.e., which java packages or classes deal with this)?
> hedwig-server/src/main/java/org/apache/hedwig/server/topic
> &
> hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions
> should cover most of what you're interested in.
> -Ivan

Daniel S. Kim
