From: "Max Grigoriev" <darkit@gmail.com>
To: hbase-user@hadoop.apache.org
Date: Tue, 29 Apr 2008 13:51:03 +0300
Subject: Re: Is HBase suitable for ...

Replies and questions inline.

> On Apr 28, 2008, at 2:57 PM, Max Grigoriev wrote:
>
> What kind of search on different table attributes do you want to do?
> There are no general-purpose secondary indexes in HBase, so you
> either have to do a full- or partial-table scan or put the search
> attribute in the primary key.

The system is the core of different social networks, so it should be able
to search on every attribute, because during core development you don't
know all the entities or all the search queries in advance. So I'm thinking
of using a Hibernate-style mapping (no relations such as many-to-one, just
single attributes) in which the user describes an entity and marks whether
an attribute is indexed; for indexed attributes the system would create a
secondary index. Since HBase doesn't support secondary indexes, I think I
can emulate them by maintaining, by hand, a secondary-index -> primary-key
mapping, as is done in Berkeley DB, for example.

> As far as failover, at the moment, HBase has good recovery for region
> servers, and no recovery for the master. That's something we're
> hoping to change in the future.

Is that future near or far?
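The hand-maintained secondary index described above (a separate mapping from attribute value back to primary key, as in Berkeley DB) can be sketched generically. This is only an illustration of the technique: the `IndexedStore` class and the in-memory dicts are stand-ins for two HBase tables, not HBase API calls.

```python
# Sketch of emulating a secondary index with a second table that maps
# attribute value -> set of primary keys. The dicts stand in for two
# HBase tables; every write must update both, and the application is
# responsible for keeping them consistent.

class IndexedStore:
    def __init__(self, indexed_attrs):
        self.primary = {}  # row key -> record (dict of attributes)
        # attribute name -> (attribute value -> set of row keys)
        self.indexes = {a: {} for a in indexed_attrs}

    def put(self, key, record):
        old = self.primary.get(key)
        for attr, index in self.indexes.items():
            # Remove the stale index entry for the old attribute value.
            if old is not None and attr in old:
                index.get(old[attr], set()).discard(key)
            # Add the entry for the new value.
            if attr in record:
                index.setdefault(record[attr], set()).add(key)
        self.primary[key] = record

    def find_by(self, attr, value):
        # Index lookup: value -> primary keys -> records.
        keys = self.indexes[attr].get(value, set())
        return [self.primary[k] for k in sorted(keys)]

    def scan_by(self, attr, value):
        # The fallback without an index: a full-table scan.
        return [r for r in self.primary.values() if r.get(attr) == value]
```

Here `put` rewrites the index entries synchronously; with two real HBase tables the two writes are not atomic, so a crash between them can leave a dangling index entry that readers must tolerate (e.g. by verifying the primary row still matches before returning it).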
Can I create a new master if the initial master fails? Can the master have
slaves?

> Can you tell me whether HBase will work for such a system?
> I think HBase can do what you need, but it'd be nice to have more
> details about what exactly you're going to do with it.

I don't know :) because the application developer will decide what the
entities are and what they do. What I have to do is create an environment
that makes it easy to build applications.

> If we have 2 or 3 data centers and we lose the connection between them,
> what behavior of HBase will we see?
> Is your intent to run a single HBase instance across several data
> centers?

Yes, because you don't know which data center may go down.

> At the moment, if a regionserver is cut off from the master,
> it will kill itself. This means that if you have your master at one
> location and regionservers at another, and you lose connectivity,
> your regionservers at the other locations will shut themselves down.
> There are solutions to this we've discussed in the past. However, I
> wonder if maybe the correct solution is not to partition across data
> centers. It's not something that we've discussed at great length yet,
> so there might be an easier way to do it than I'm thinking.

If one data center goes down and it holds unique data, then you can't
continue to work. That's bad. So it's better to have the data in both data
centers; then if one of them is dead, you can continue to work.

> And when we restore the connection in 1-2 hours, what should we expect
> from HBase?
> This is where things would get sticky - how do you resolve conflicts
> in how data is being served, or worse, how it was split into regions?
> It seems inherently complicated and unpleasant.

You can update all records of the restored node by update timestamp.
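The last-write-wins reconciliation suggested above ("update all records of the restored node by update timestamp") can be sketched as follows. The `reconcile` helper and the key -> (timestamp, value) record format are hypothetical, assumed for illustration; this is not anything HBase provides.

```python
# Sketch of last-write-wins reconciliation after a partition heals:
# for every key present on either side, keep the version with the
# newest update timestamp. Each side is a map of key -> (ts, value).

def reconcile(local, restored):
    """Merge two key -> (timestamp, value) maps; newest timestamp wins."""
    merged = dict(local)
    for key, (ts, value) in restored.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged
```

Note the caveats with this scheme: a concurrent update on the losing side is silently dropped, clocks in the two data centers must be reasonably synchronized for timestamps to be comparable, and deletes need tombstone records so that an older surviving copy doesn't resurrect a deleted row.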