Subject: Re: Design review: Secondary index support through coprocess
From: Michael Segel
Date: Thu, 23 Jan 2014 04:45:55 -0600
To: dev@hbase.apache.org

Wow. That's the first time in 25 years that I've heard someone actually
reference the dining philosophers problem.
;-)

On Jan 22, 2014, at 1:35 PM, Wei Tan wrote:

> Thanks, Vladimir. So an RPC call RS1 --> RS2 takes two handlers, one from
> RS1 and one from RS2? If that is true, then I understand that it is a
> typical dining philosophers problem.
>
> Maybe a random yielding mechanism can solve this problem.
> Best regards,
> Wei
>
> ---------------------------------
> Wei Tan, PhD
> Research Staff Member
> IBM T. J. Watson Research Center
> http://researcher.ibm.com/person/us-wtan
>
> From: Vladimir Rodionov
> To: "dev@hbase.apache.org"
> Date: 01/22/2014 12:09 PM
> Subject: RE: Design review: Secondary index support through coprocess
>
> Deadlocks are possible because cross-region RPCs create cyclic
> dependencies in an HBase cluster.
>
> RS1 -> RS2 -> RS3 -> RS1, where -> is an RPC call
>
> Now imagine that the last call from RS3 to RS1 is blocked because there
> are no more available handler threads to process it.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Wei Tan [wtan@us.ibm.com]
> Sent: Wednesday, January 22, 2014 7:51 AM
> To: dev@hbase.apache.org
> Subject: RE: Design review: Secondary index support through coprocess
>
> Why is cross-RS RPC going to cause deadlocks? Is it a matter of logical
> incorrectness, or of resource exhaustion? Say, if we set the number of
> handlers to be large, does the deadlock logically still occur?
> Best regards,
> Wei
>
> From: Vladimir Rodionov
> To: "dev@hbase.apache.org"
> Date: 01/20/2014 03:00 PM
> Subject: RE: Design review: Secondary index support through coprocess
>
>>> Yes, the coprocessors potentially cross RS boundaries.
>
> The open path to disaster. Inter-region RPCs in coprocessors may
> result in periodic cluster-wide deadlocks.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: James Taylor [jtaylor@salesforce.com]
> Sent: Monday, January 20, 2014 11:39 AM
> To: dev@hbase.apache.org
> Subject: Re: Design review: Secondary index support through coprocess
>
> Yes, the coprocessors potentially cross RS boundaries. No, the index is
> not co-located with the main table. Take a look at the link I sent as that
> should be able to answer a lot of questions.
>
> Thanks,
> James
>
> On Mon, Jan 20, 2014 at 11:03 AM, Michael Segel wrote:
>
>> James,
>>
>> Ok...
>>
>> It's been a while since we talked about this...
>>
>> While the index is in a separate table, is that table being split and
>> collocated with the main table?
>>
>> If you're using the coprocessor to maintain the index, that would imply
>> you're crossing RS boundaries if your index is truly orthogonal.
>>
>> Is this what you're doing?
>>
>> On Jan 20, 2014, at 11:32 AM, James Taylor wrote:
>>
>>> Mike,
>>> Yes, you're mistaken:
>>> - secondary indexes in Phoenix are orthogonal to the base table. They're
>>> in a separate table
>>> (http://phoenix.incubator.apache.org/secondary_indexing.html).
>>> - Phoenix has joins. They're in our master branch with a release
>>> scheduled for next month
>>> - numeric strings? Not a use case for indexing numeric data? Have you
>>> ever seen a number used as an ID?
>>> Thanks,
>>> James
>>>
>>> On Mon, Jan 20, 2014 at 8:50 AM, Michael Segel
>>> <michael_segel@hotmail.com> wrote:
>>>
>>>> Indexes tend to be orthogonal to the base table, not to mention if
>>>> you're using an inverted table for an index, your index table would be
>>>> much thinner than your base table.
>>>>
>>>> Having said that, the solution proposed by Yu, Taylor and others only
>>>> works if you want to use the index to help on server-side filtering and
>>>> misses the boat on the larger and broader picture of improving query
>>>> optimization and joins.
>>>>
>>>> HINT: Unless I am mistaken... until you treat the index as orthogonal
>>>> to the base table, you will always lag behind the performance of
>>>> traditional MPP DWs like Informix XPS. (Now part of IBM's IM pillar.)
>>>>
>>>> In addition, until you fix coprocessors in general, you will have
>>>> scalability and performance issues.
>>>> (Note that you can write a coprocessor to create a sandbox and separate
>>>> the co-process from the RS JVM; however, it would be better if it were
>>>> part of the underlying coprocessor code.)
>>>>
>>>> The current implementation makes joins worthless.
>>>> (Note that in prior discussions, Phoenix doesn't do joins...)
>>>> Here's why:
>>>> In order to do a join, if you use the proposed index, you have to first
>>>> reduce each index into a single, sort-ordered set. Then you can take
>>>> the intersection of the index result sets. The final set would be in
>>>> sort order and a subset of the total rows. You can then fetch the rows
>>>> and still do a server-side filter before returning the ultimate result
>>>> set.
>>>>
>>>> It's that first step of reducing each result set into a single
>>>> sort-ordered set that takes a lot of effort.
>>>>
>>>> On a side note... there's been some mention of ordering floats.
>>>> Again, just a word of caution... there isn't a really strong use case
>>>> for indexing numeric data types. Period. And to be very, very clear,
>>>> there is a distinction between numeric strings and numeric data types.
>>>>
>>>> -Mike
>>>>
>>>> PS. Because of my role as a consultant, I am very, very limited in what
>>>> I can say and contribute. I don't own my work product, my clients do.
>>>> Take what I say with a grain of salt. I'm just a skinny little boy from
>>>> Cleveland Ohio, come to chase your beers and drink your women... ;-)
>>>>
>>>> On Jan 9, 2014, at 10:48 AM, James Taylor wrote:
>>>>
>>>>> IMHO, it would be valuable if the design considered both a global
>>>>> indexing solution and a local indexing solution. Both are useful in
>>>>> different circumstances. The global indexing design plus the
>>>>> application integration points could be derived from Jesse's work with
>>>>> his reference implementation in Phoenix - the global indexing code has
>>>>> no Phoenix dependencies and clearly defined integration points.
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> On Jan 9, 2014, at 6:36 AM, Jesse Yates wrote:
>>>>>
>>>>>> Yes, that was a big concern I had as well.
>>>>>>
>>>>>> It's not clear how that will work with a large number of indexes; if
>>>>>> people have one index, they will want more than one. To not plan for
>>>>>> that seems like an incomplete implementation to me. In a horizontally
>>>>>> scalable system like HBase, lots of buddy regions isn't going to work
>>>>>> out well.* Once we have regions that cannot be collocated, the extra
>>>>>> RPC time starts to be the biggest factor (as the doc points out) and
>>>>>> we are back to what Phoenix is already doing**.
>>>>>>
>>>>>> But I'm probably missing something here in what makes it different?
>>>>>>
>>>>>> For folks that haven't been following the issue, some high-level "how
>>>>>> it all kinda works" would be helpful from the championing committers;
>>>>>> that's a long doc to get through and grok :). How similar is this to
>>>>>> the work currently done by the existing indexing implementations
>>>>>> (Huawei, Phoenix, NGDATA)? The doc doesn't really nail down the
>>>>>> interactions, but instead jumps right in after describing why SI
>>>>>> should be added.
>>>>>>
>>>>>> Agree this would be super useful, but don't want to waste too much
>>>>>> work reinventing the wheel or doing the wrong thing. Further, this
>>>>>> impl quickly starts to lead down the query optimization path, which
>>>>>> gets HBase away from its core "be a great byte store".
>>>>>>
>>>>>> Like I said, I'm all for secondary indexes in HBase and think this is
>>>>>> a great push. I don't mean to rain on any parades.
>>>>>>
>>>>>> - jesse
>>>>>>
>>>>>> * but a smart way to specify region collocation? That I can get
>>>>>> behind as it would unify a couple different indexing impls (e.g.
>>>>>> Phoenix would consider using it to help make indexing faster - RPCs
>>>>>> do suck).
>>>>>>
>>>>>> ** for instance, the doc talks about how to implement indexing for
>>>>>> floats... That might be a default impl, but for use cases like
>>>>>> Phoenix this would break all our current encodings. We handled this
>>>>>> in the indexing impl by making the builder pluggable for different
>>>>>> use cases to support different encodings. I feel like a lot of the
>>>>>> code for this kind of SI impl is already in Phoenix and has been
>>>>>> working and fast for several months now; it's surprisingly tricky,
>>>>>> especially with the delete cases and timestamp manipulation issues.
>>>>>>
>>>>>> On Thursday, January 9, 2014, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
>>>>>> wrote:
>>>>>>
>>>>>>> Could you explain how the 1-1 association between user and index
>>>>>>> table regions is maintained. I wasn't able to understand fully from
>>>>>>> the document.
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: Ted Yu
>>>>>>> To: dev@hbase.apache.org
>>>>>>> At: Jan 8, 2014 3:41:40 PM
>>>>>>>
>>>>>>> Hi,
>>>>>>> Secondary index support is a frequently requested feature.
>>>>>>>
>>>>>>> Please find the updated design doc here:
>>>>>>>
>>>>>>> https://issues.apache.org/jira/secure/attachment/12621909/SecondaryIndex%20Design_Updated_2.pdf
>>>>>>>
>>>>>>> HBASE-9203 is the umbrella JIRA.
>>>>>>>
>>>>>>> Implementation patch was attached to HBASE-10222
>>>>>>>
>>>>>>> Thanks to Rajesh who works on this feature.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> -------------------
>>>>>> Jesse Yates
>>>>>> @jesse_yates
>>>>>> jyates.github.com
>
> Confidentiality Notice: The information contained in this message,
> including any attachments hereto, may be confidential and is intended to
> be read only by the individual or entity to whom this message is
> addressed. If the reader of this message is not the intended recipient or
> an agent or designee of the intended recipient, please note that any
> review, use, disclosure or distribution of this message or its
> attachments, in any form, is strictly prohibited. If you have received
> this message in error, please immediately notify the sender and/or
> Notifications@carrieriq.com and delete or destroy any copy of this message
> and its attachments.
>

The opinions expressed here are mine, while they may reflect a cognitive
thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com
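
The handler exhaustion described in the quoted thread (RS1 -> RS2 -> RS3 -> RS1) can be illustrated with a toy model. This is a minimal sketch only, not HBase code: the function name ring_deadlocks, the ring topology, and the pool sizes are all assumptions made for illustration. Each simulated region server has a fixed pool of handlers, and a client call holds one local handler while its nested RPC waits for a free handler on the next server in the ring.

```python
def ring_deadlocks(num_servers: int, handlers: int, calls_per_server: int) -> bool:
    """Return True if the toy ring RS1 -> RS2 -> ... -> RS1 ends up stuck.

    Hypothetical model: every accepted client call occupies one handler on
    its own server and then needs one free handler on the next server to
    finish. Calls beyond the pool size are ignored here (a real server
    would queue them).
    """
    busy = [0] * num_servers   # handlers currently held on each server
    waiting = []               # (holder, target) pairs still in flight

    # Phase 1: client calls arrive everywhere and grab local handlers.
    for holder in range(num_servers):
        accepted = min(calls_per_server, handlers)
        busy[holder] = accepted
        target = (holder + 1) % num_servers
        waiting.extend((holder, target) for _ in range(accepted))

    # Phase 2: let any call finish whose target still has a free handler.
    progress = True
    while waiting and progress:
        progress = False
        for call in list(waiting):
            holder, target = call
            if busy[target] < handlers:
                # The nested RPC runs and returns at once, so the target
                # handler is freed again and the outer call frees its own.
                busy[holder] -= 1
                waiting.remove(call)
                progress = True

    # Anything still waiting can never be served: every pool is full of
    # calls that are themselves waiting on another full pool.
    return bool(waiting)


if __name__ == "__main__":
    for handlers, calls in [(2, 1), (2, 2), (100, 99), (100, 100)]:
        print(handlers, calls, ring_deadlocks(3, handlers, calls))
```

Under these assumptions, a larger pool only raises the load needed to saturate it: with 100 handlers the ring still gets stuck once 100 calls per server are in flight at the same time, which is one way to read the answer to the "#handler" question in the thread.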