From: Jesse Yates
To: dev@hbase.apache.org
Date: Mon, 20 Jan 2014 14:53:47 -0800
Subject: Re: Design review: Secondary index support through coprocess

That's easier said than done when dealing with HBase writes... In Phoenix,
we kind of get around the deadlock issue by separating out the write layer
for indexes to make it queuable and using our own thread pools to make it
somewhat async. It just ends up being hard to ensure consistency when you
don't block on the write being committed :)

I started working on a custom RPC handler for index writes, but it's only
really doable on 96+... The work for 94 is really invasive; the solution
ends up being "have a lot of handlers" and don't clobber them.

We do get some wins using the "backdoor" writes when the region is on the
same server, but that obviously doesn't solve the problem entirely.

-- jesse

On Jan 20, 2014 2:46 PM, "Andrew Purtell" wrote:

> Or don't do blocking I/O in the context of the RPC handler thread. Queue
> the work and let the handler return.
>
>
> On Mon, Jan 20, 2014 at 1:54 PM, lars hofhansl wrote:
>
> > Yep. That's my concern too. Would need to configure a generous number of
> > handlers to prevent this from happening.
> >
> > ________________________________
> > From: Vladimir Rodionov
> > To: "dev@hbase.apache.org"
> > Sent: Monday, January 20, 2014 11:57 AM
> > Subject: RE: Design review: Secondary index support through coprocess
> >
> > >>Yes, the coprocessors potentially cross RS boundaries.
> >
> > The open path to the disaster. Inter region RPCs in coprocessors may
> > result in periodic cluster-wide deadlocks
> >
> >
> > Best regards,
> > Vladimir Rodionov
> > Principal Platform Engineer
> > Carrier IQ, www.carrieriq.com
> > e-mail: vrodionov@carrieriq.com
> >
> > ________________________________________
> >
> > From: James Taylor [jtaylor@salesforce.com]
> > Sent: Monday, January 20, 2014 11:39 AM
> > To: dev@hbase.apache.org
> > Subject: Re: Design review: Secondary index support through coprocess
> >
> > Yes, the coprocessors potentially cross RS boundaries. No, the index is
> > not co-located with the main table. Take a look at the link I sent as
> > that should be able to answer a lot of questions.
> >
> > Thanks,
> > James
> >
> >
> > On Mon, Jan 20, 2014 at 11:03 AM, Michael Segel wrote:
> >
> > > James,
> > >
> > > Ok...
> > >
> > > It's been a while since we talked about this...
> > >
> > > While the index is in a separate table, is that table being split and
> > > collocated with the main table?
> > >
> > > If you're using the coprocessor to maintain the index, that would
> > > imply you're crossing RS boundaries if your index is truly orthogonal.
> > >
> > > Is this what you're doing?
> > >
> > > On Jan 20, 2014, at 11:32 AM, James Taylor wrote:
> > >
> > > > Mike,
> > > > Yes, you're mistaken:
> > > > - secondary indexes in Phoenix are orthogonal to the base table.
> > > > They're in a separate table (
> > > > http://phoenix.incubator.apache.org/secondary_indexing.html).
> > > > - Phoenix has joins. They're in our master branch with a release
> > > > scheduled for next month
> > > > - numeric strings?
> > > > Not a use case for indexing numeric data? Have you ever
> > > > seen a number used as an ID?
> > > > Thanks,
> > > > James
> > > >
> > > >
> > > > On Mon, Jan 20, 2014 at 8:50 AM, Michael Segel <michael_segel@hotmail.com> wrote:
> > > >
> > > >> Indexes tend to be orthogonal to the base table, not to mention if
> > > >> you're using an inverted table for an index, your index table would
> > > >> be much thinner than your base table.
> > > >>
> > > >> Having said that, the solution proposed by Yu, Taylor and others
> > > >> only works if you want to use the index to help on server side
> > > >> filtering and misses the boat on the larger and broader picture of
> > > >> improving query optimization and joins.
> > > >>
> > > >> HINT: Unless I am mistaken... until you treat the index as
> > > >> orthogonal to the base table, you will always lag performance of
> > > >> traditional MPP DWs like Informix XPS. (Now part of IBM's IM pillar)
> > > >>
> > > >> In addition, until you fix coprocessors in general, you will have
> > > >> scalability and performance issues.
> > > >> (Note that you can write a coprocessor to create a sandbox and
> > > >> separate the co-process from the RS jvm, however it would be better
> > > >> if it were part of the underlying coprocessor code.)
> > > >>
> > > >> The current implementation makes joins worthless.
> > > >> (Note that in prior discussions, Phoenix doesn't do joins...)
> > > >> Here's why:
> > > >> In order to do a join, if you use the proposed index, you have to
> > > >> first reduce each index in to a single, sort ordered set. Then you
> > > >> can take the intersection of the index result sets. The final set
> > > >> would be in sort order and a subset of the total rows. You can then
> > > >> fetch the rows and still do a server side filter before returning
> > > >> the ultimate result set.
> > > >>
> > > >> It's that first step of reducing each result set in to a single
> > > >> sort ordered set that takes a lot of effort.
> > > >>
> > > >>
> > > >> On a side note... there's been some mention of ordering floats.
> > > >> Again, just a word of caution... there isn't a really strong use
> > > >> case for indexing numeric data types. period. And to be very, very
> > > >> clear, there is a distinction between numeric strings and numeric
> > > >> data types.
> > > >>
> > > >> -Mike
> > > >>
> > > >> PS. Because of my role as a consultant, I am very, very limited in
> > > >> what I can say and contribute. I don't own my work product, my
> > > >> clients do. Take what I say with a grain of salt. I'm just a skinny
> > > >> little boy from Cleveland Ohio, come to chase your beers and drink
> > > >> your women... ;-)
> > > >>
> > > >> On Jan 9, 2014, at 10:48 AM, James Taylor wrote:
> > > >>
> > > >>> IMHO, it would be valuable if the design considered both a global
> > > >>> indexing solution and a local indexing solution. Both are useful in
> > > >>> different circumstances. The global indexing design plus the
> > > >>> application integration points could be derived from Jesse's work
> > > >>> with his reference implementation in Phoenix - the global indexing
> > > >>> code has no Phoenix dependencies and clearly defined integration
> > > >>> points.
> > > >>>
> > > >>> Thanks,
> > > >>> James
> > > >>>
> > > >>> On Jan 9, 2014, at 6:36 AM, Jesse Yates wrote:
> > > >>>
> > > >>>> Yes, that was a big concern I had as well.
> > > >>>>
> > > >>>> It's not clear how that will work with a large number of indexes;
> > > >>>> if people have one index, they will want more than one. To not
> > > >>>> plan for that seems like an incomplete implementation to me.
> > > >>>> In a horizontally scalable system like HBase, lots of buddy
> > > >>>> regions isn't going to work out well..* Once we have regions that
> > > >>>> cannot be collocated, the extra RPC time starts to be the biggest
> > > >>>> factor (as the doc points out) and we are back to what Phoenix is
> > > >>>> already doing**.
> > > >>>>
> > > >>>> But I'm probably missing something here in what makes it different?
> > > >>>>
> > > >>>> For folks that haven't been following the issue some high-level
> > > >>>> "how it all kinda works" would be helpful from the championing
> > > >>>> committers; that's a long doc to get through and grok :). How
> > > >>>> similar is this to the work currently done by the existing
> > > >>>> indexing implementations (huawei, Phoenix, ngdata)? The doc
> > > >>>> doesn't really nail down the interactions, but instead just jumps
> > > >>>> right in after describing why SI should be added.
> > > >>>>
> > > >>>> Agree this would be super useful, but don't want to waste too much
> > > >>>> work reinventing the wheel or doing the wrong thing. Further, this
> > > >>>> impl quickly starts to lead down the query optimization path,
> > > >>>> which gets HBase away from its core "be a great byte store".
> > > >>>>
> > > >>>> Like I said, I'm all for secondary indexes in HBase and think this
> > > >>>> is a great push. I don't mean to rain on any parades.
> > > >>>>
> > > >>>> - jesse
> > > >>>>
> > > >>>> * but a smart way to specify region collocation? That I can get
> > > >>>> behind as it would unify a couple different indexing impls (e.g.
> > > >>>> Phoenix would consider using it to help make indexing faster -
> > > >>>> RPCs do suck).
> > > >>>>
> > > >>>> ** for instance, the doc talks about how to implement indexing for
> > > >>>> floats...
> > > >>>> That might be a default impl, but for use cases like Phoenix this
> > > >>>> would break all our current encodings. We handled this in the
> > > >>>> indexing impl by making the builder pluggable for different use
> > > >>>> cases to support different encodings. I feel like a lot of the
> > > >>>> code for this kind of SI impl is already in Phoenix and has been
> > > >>>> working and fast for several months now; it's surprisingly tricky,
> > > >>>> especially with the delete cases and time stamp manipulation
> > > >>>> issues.
> > > >>>>
> > > >>>>
> > > >>>> On Thursday, January 9, 2014, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Could you explain how the 1-1 association between user and index
> > > >>>>> table regions is maintained. I wasn't able to understand fully
> > > >>>>> from the document.
> > > >>>>>
> > > >>>>> ----- Original Message -----
> > > >>>>> From: Ted Yu
> > > >>>>> To: dev@hbase.apache.org
> > > >>>>> At: Jan 8, 2014 3:41:40 PM
> > > >>>>>
> > > >>>>> Hi,
> > > >>>>> Secondary index support is a frequently requested feature.
> > > >>>>>
> > > >>>>> Please find the updated design doc here:
> > > >>>>>
> > > >>>>> https://issues.apache.org/jira/secure/attachment/12621909/SecondaryIndex%20Design_Updated_2.pdf
> > > >>>>>
> > > >>>>> HBASE-9203 is the umbrella JIRA.
> > > >>>>>
> > > >>>>> Implementation patch was attached to HBASE-10222
> > > >>>>>
> > > >>>>> Thanks to Rajesh who works on this feature.
> > > >>>>>
> > > >>>>> Cheers
> > > >>>>>
> > > >>>>
> > > >>>> --
> > > >>>> -------------------
> > > >>>> Jesse Yates
> > > >>>> @jesse_yates
> > > >>>> jyates.github.com
> > > >>>
> > > >>
> > > >
> > >
> >
>
> --
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
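
Editorial sketch: two ideas from the thread, Andrew's "queue the work and let the handler return" and the Phoenix approach Jesse describes (a separate, queueable index write layer with its own thread pool), might look roughly like the Java below. This is a minimal illustration only, not HBase or Phoenix code. The class name IndexWriteQueue, its methods, and the pool sizes are hypothetical, and the index write itself is stood in for by an arbitrary Supplier.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of HBase or Phoenix.
class IndexWriteQueue {
    private final ExecutorService pool;

    IndexWriteQueue(int threads, int capacity) {
        // Dedicated pool with a bounded queue, so index writes never occupy
        // the caller's (RPC handler) thread. When the queue is full,
        // AbortPolicy makes enqueue() throw RejectedExecutionException,
        // which surfaces back-pressure instead of blocking the handler.
        this.pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(capacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Hands the (assumed blocking) index write to the pool and returns at
    // once. The caller decides whether and when to wait on the future.
    <T> CompletableFuture<T> enqueue(Supplier<T> indexWrite) {
        return CompletableFuture.supplyAsync(indexWrite, pool);
    }

    // Stops accepting work and waits briefly for queued writes to finish.
    void shutdown() {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        IndexWriteQueue queue = new IndexWriteQueue(2, 16);
        // The "handler" only enqueues; the supplier stands in for a remote write.
        CompletableFuture<String> pending = queue.enqueue(() -> "index update applied");
        System.out.println(pending.join());
        queue.shutdown();
    }
}
```

Returning a future keeps the trade-off Jesse mentions visible: a caller that never waits on it gets the async behaviour but gives up the guarantee that the index write was committed, while a caller that joins immediately is back to blocking.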