From: Benjamin Reed
Date: Fri, 08 Oct 2010 20:09:32 -0700
To: zookeeper-user@hadoop.apache.org
Subject: Re: Question on production readiness, deployment, data of BookKeeper / Hedwig
Message-ID: <4CAFDCEC.80408@yahoo-inc.com>

your guess is correct :) for bookkeeper and hedwig we released early to do
the development in public. originally we developed bookkeeper as a
distributed write ahead log for the NameNode in HDFS, but while we were able
to get a proof of concept going, the structure of the code of the NameNode
makes it difficult to integrate well. we are currently working on fixing the
write ahead layer of the NameNode, which is taking a lot of time. in the
meantime we applied bookkeeper to pub/sub and came up with hedwig, which is
where most of our efforts are focused while the slow process of pushing
changes to the NameNode proceeds.

ben

On 10/08/2010 02:32 PM, Jake Mannix wrote:
> Hi Ben,
>
> To follow up with this question, which seems to be asking primarily about
> Hedwig (and I guess the answer is: it's not in production yet, anywhere),
> with one more about BookKeeper: is BookKeeper used in production as a WAL
> (or for any other use) anywhere? If so, for what uses?
>
> Any info (even anecdotal) would be great!
>
> -jake
>
> On Thu, Oct 7, 2010 at 9:15 AM, Benjamin Reed wrote:
>
>> hi amit,
>>
>> sorry for the late response. this week has been crunch time for a lot of
>> different things.
>>
>> here are your answers:
>>
>> production
>>
>> 1. it is still in prototype phase. we are evaluating different aspects,
>> but there is still some work to do to make it production ready. we also
>> need to get an engineering team to sign up to stand behind it.
>>
>> 2. it's a generic pub/sub message bus. in some sense it is really a
>> datacenter solution with extensions for multi-data center operation, so
>> it is perfectly reasonable to use it in a single datacenter setting.
>>
>> 3. yeah, we have removed the hw.bash script. it had some hardcoded
>> assumptions and was a swiss army knife on steroids. we have been breaking
>> it up into simpler scripts.
>>
>> 4. session expiry really represents a fundamental connectivity problem,
>> so both bk and hedwig restart the component that gets the expired session
>> error.
>>
>> data
>>
>> 1. yes.
>>
>> 2. once all subscribers have consumed a message there is a background
>> process that cleans it up.
>>
>> 3. yes, there is a replication factor and we ensure replication on
>> writes, and there is a recovery tool to recover bookies that fail. we
>> don't have to worry about conflicts because there is only a single writer
>> for a given ledger. because of this we do not need to do quorum reads.
>>
>> documentation
>>
>> yes, this is something we need to work on. i'll see if i can push out
>> some of our hello world applications. we'd also like to put a JMS API on
>> top so that the API is more familiar (and documented :). i don't want to
>> delay the answers to your other questions, so let me answer that
>> HedwigSubscriber is the class for clients. the other classes are
>> internal. (hubs use a special kind of subscription to do cross data
>> center updates.)
>>
>> ben
>>
>> On 10/05/2010 10:32 PM, amit jaiswal wrote:
>>
>>> Hi,
>>>
>>> In the Hedwig talk (http://vimeo.com/13282102), it was mentioned that
>>> the primary use case for Hedwig comes from the distributed key-value
>>> store PNUTS in Yahoo!, but it was also said that the work is new.
>>>
>>> Could you please comment on the following:
>>>
>>> Production readiness / Deployment
>>> 1. What is the production readiness of Hedwig / BookKeeper? Is it being
>>> used anywhere (like in PNUTS)?
>>> 2. Is Hedwig designed to be used as a generic message bus or only for
>>> multi-datacenter operations?
>>> 3. Hedwig installation and deployment is done through a script hw.bash,
>>> but that is difficult to use, especially in a production environment.
>>> Are there any other packages available that can simplify the deployment
>>> of hedwig?
>>> 4. How does BK/Hedwig handle zookeeper session expiry?
>>>
>>> Data Deletion, Handling data loss, Quorum
>>> 1. Does BookKeeper support deletion of old log entries which have been
>>> consumed?
>>> 2. How does Hedwig handle the case when all subscribers have consumed
>>> all the messages? In the talk, it was said that a subscriber can come
>>> back after hours, days or weeks. Is there any data retention /
>>> expiration policy for the data that is published?
>>> 3. How does Hedwig handle data loss? There is a replication factor, and
>>> a write operation must be accepted by a majority of the bookies, but how
>>> are data conflicts handled? Is there any possibility of data conflict at
>>> all? Is the replication only for recovery? When the hub is reading data
>>> from bookies, does it read from all the bookies to satisfy a quorum
>>> read?
>>>
>>> Code
>>> What is the difference between PubSubServer, HedwigSubscriber,
>>> HedwigHubSubscriber? Is there any HelloWorld program that simply
>>> illustrates how to instantiate a hedwig client, and publish/consume
>>> messages? (The HedwigBenchmark class is helpful, but I was looking for
>>> something like API documentation.)
>>>
>>> -regards
>>> Amit
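
For readers who want the cleanup rule from the "data" answers made concrete (a message becomes removable once all subscribers have consumed it), here is a minimal Python sketch. It is not Hedwig code and does not use any Hedwig or BookKeeper API: the Topic class, its method names, and the idea of tracking one consumed sequence number per subscriber are all assumptions made for illustration only.

```python
class Topic:
    """Toy in-memory topic (hypothetical; not part of Hedwig)."""

    def __init__(self, subscribers):
        # Highest sequence number each subscriber has consumed so far.
        self.consumed = {name: 0 for name in subscribers}
        self.messages = {}  # sequence number -> payload
        self.next_seq = 1

    def publish(self, payload):
        # Store the payload under the next sequence number and return it.
        seq = self.next_seq
        self.messages[seq] = payload
        self.next_seq += 1
        return seq

    def consume(self, subscriber, seq):
        # Record that `subscriber` has consumed everything up to `seq`.
        self.consumed[subscriber] = max(self.consumed[subscriber], seq)

    def cleanup(self):
        # Stand-in for the background process: remove every message that
        # all subscribers have consumed. With no subscribers this sketch
        # keeps everything, which is an arbitrary choice.
        if not self.consumed:
            return []
        low = min(self.consumed.values())
        removed = [seq for seq in self.messages if seq <= low]
        for seq in removed:
            del self.messages[seq]
        return removed


topic = Topic(["a", "b"])
for text in ("m1", "m2", "m3"):
    topic.publish(text)
topic.consume("a", 3)
topic.consume("b", 1)
print(topic.cleanup())         # [1]
print(sorted(topic.messages))  # [2, 3]
```

In this sketch a slow subscriber ("b" above) holds back cleanup for everyone, which matches the point raised in the thread that a subscriber may come back after hours, days or weeks. Whether the real system applies any additional retention policy is not stated in the thread.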