Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
Received-SPF: pass (nike.apache.org: domain of michael_segel@hotmail.com
 designates 65.55.111.111 as permitted sender)
Message-ID: <BLU0-SMTP39385EB933739EC3FD759A48FA60@phx.gbl>
Content-Type: text/plain; charset="windows-1252"
MIME-Version: 1.0 (Mac OS X Mail 7.1 \(1827\))
Subject: Re: Design review: Secondary index support through coprocess
From: Michael Segel <michael_segel@hotmail.com>
In-Reply-To: 
 <OFF6DFE93B.419A433D-ON85257C68.006B5684-85257C68.006B9F5D@us.ibm.com>
Date: Thu, 23 Jan 2014 04:45:55 -0600
Content-Transfer-Encoding: quoted-printable
References: <52CEAAFA0196073C003A00A2_0_821354@p058>
	<CAB5sDNKXaj-r71y5i1y6RB0duCjq227YhF3nUiChnb1pp_L9ZQ@mail.gmail.com>
 <-4494665118441587577@unknownmsgid>
	<BLU0-SMTP3367FD88149B7FED35BC9D68FA50@phx.gbl>
 <CAG_TOPCp=-s4xvNT=+YDDPDXLuA9aW_3u4ENDipfptnSjhx=Vg@mail.gmail.com>
	<BLU0-SMTP134A0B94EA4DC7599AFEDFC8FA50@phx.gbl>,<CAG_TOPAg8YKyKpRgGa8abx=zM3BN3hmcppyAnyVVsfV0kKss=w@mail.gmail.com>
 <DC5EBE7F3610EB4CA5C7E92D76873E1518629B58BA@exchange2007.carrieriq.com>,<OFAEBA4BA4.118E80EF-ON85257C68.0056F70C-85257C68.00571A79@us.ibm.com>
 <DC5EBE7F3610EB4CA5C7E92D76873E1518629B58BF@exchange2007.carrieriq.com>
 <OFF6DFE93B.419A433D-ON85257C68.006B5684-85257C68.006B9F5D@us.ibm.com>
To: dev@hbase.apache.org

Wow.

That's the first time in 25 years that I've heard someone actually
reference the dining philosophers problem.
;-)

On Jan 22, 2014, at 1:35 PM, Wei Tan <wtan@us.ibm.com> wrote:

> Thanks, Vladimir. So an RPC call RS1 --> RS2 takes two handlers, one from
> RS1 and one from RS2? If that is true, then I understand that it is a
> typical dining philosophers problem.
>
> Maybe a random yielding mechanism can solve this problem.
> Best regards,
> Wei
>
> ---------------------------------
> Wei Tan, PhD
> Research Staff Member
> IBM T. J. Watson Research Center
> http://researcher.ibm.com/person/us-wtan
>
>
>
> From:   Vladimir Rodionov <vrodionov@carrieriq.com>
> To:     "dev@hbase.apache.org" <dev@hbase.apache.org>,
> Date:   01/22/2014 12:09 PM
> Subject:        RE: Design review: Secondary index support through
> coprocess
>
>
>
> Deadlocks are possible because cross-region RPCs create cyclic
> dependencies in an HBase cluster.
>
> RS1 -> RS2 -> RS3 -> RS1, where -> is an RPC call.
>
> Now imagine that the last call, from RS3 to RS1, is blocked because there
> are no more available handler threads to process it.
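
The handler exhaustion described above can be modelled in a few lines of
Python. This is only a sketch under stated assumptions, not HBase code: the
function name, the `handlers` map and the `(holder, target)` pairs are all
hypothetical, and each in-flight request is assumed to keep its handler on
one server while it waits for a free handler on another.

```python
from collections import Counter


def deadlocked_servers(handlers, waiting):
    """Return the set of servers that can never free a handler.

    handlers: dict of server name -> number of handler threads (hypothetical).
    waiting:  list of (holder, target) pairs; each pair is one in-flight
              request that occupies a handler on `holder` while blocked on
              an RPC that needs a free handler on `target`.
    """
    held = Counter(holder for holder, _ in waiting)
    # Start by suspecting every server whose handlers are all occupied.
    stuck = {s for s, n in handlers.items() if held[s] >= n}
    changed = True
    while changed:
        changed = False
        for server in list(stuck):
            # A request whose target is not stuck will eventually finish and
            # release its handler, so `server` is not permanently stuck.
            if any(h == server and t not in stuck for h, t in waiting):
                stuck.discard(server)
                changed = True
    return stuck


servers = {"RS1": 1, "RS2": 1, "RS3": 1}
cycle = [("RS1", "RS2"), ("RS2", "RS3"), ("RS3", "RS1")]
print(sorted(deadlocked_servers(servers, cycle)))
```

In this model, adding handlers only raises the number of concurrent cyclic
requests needed to saturate every server; it does not remove the cycle.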
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Wei Tan [wtan@us.ibm.com]
> Sent: Wednesday, January 22, 2014 7:51 AM
> To: dev@hbase.apache.org
> Subject: RE: Design review: Secondary index support through coprocess
>
> Why is cross-RS RPC going to cause deadlocks? Is it a matter of logical
> incorrectness, or of resource shortage? Say, if we set the number of
> handlers to be large, can deadlock still logically occur?
> Best regards,
> Wei
>
>
>
>
> From:   Vladimir Rodionov <vrodionov@carrieriq.com>
> To:     "dev@hbase.apache.org" <dev@hbase.apache.org>,
> Date:   01/20/2014 03:00 PM
> Subject:        RE: Design review: Secondary index support through
> coprocess
>
>
>
>>> Yes, the coprocessors potentially cross RS boundaries.
>
> An open path to disaster. Inter-region RPCs in coprocessors may
> result in periodic cluster-wide deadlocks.
>
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: James Taylor [jtaylor@salesforce.com]
> Sent: Monday, January 20, 2014 11:39 AM
> To: dev@hbase.apache.org
> Subject: Re: Design review: Secondary index support through coprocess
>
> Yes, the coprocessors potentially cross RS boundaries. No, the index is
> not co-located with the main table. Take a look at the link I sent, as that
> should be able to answer a lot of questions.
>
> Thanks,
> James
>
>
> On Mon, Jan 20, 2014 at 11:03 AM, Michael Segel
> <michael_segel@hotmail.com> wrote:
>
>> James,
>>
>> Ok...
>>
>> It's been a while since we talked about this...
>>
>> While the index is in a separate table, is that table being split and
>> collocated with the main table?
>>
>> If you're using the coprocessor to maintain the index, that would imply
>> you're crossing RS boundaries if your index is truly orthogonal.
>>
>> Is this what you're doing?
>>
>> On Jan 20, 2014, at 11:32 AM, James Taylor <jtaylor@salesforce.com> wrote:
>>
>>> Mike,
>>> Yes, you're mistaken:
>>> - Secondary indexes in Phoenix are orthogonal to the base table. They're in
>>>   a separate table
>>>   (http://phoenix.incubator.apache.org/secondary_indexing.html).
>>> - Phoenix has joins. They're in our master branch, with a release scheduled
>>>   for next month.
>>> - Numeric strings? Not a use case for indexing numeric data? Have you ever
>>>   seen a number used as an ID?
>>> Thanks,
>>> James
>>>
>>>
>>> On Mon, Jan 20, 2014 at 8:50 AM, Michael Segel <michael_segel@hotmail.com> wrote:
>>>
>>>> Indexes tend to be orthogonal to the base table, not to mention that if
>>>> you're using an inverted table for an index, your index table would be much
>>>> thinner than your base table.
>>>>
>>>> Having said that, the solution proposed by Yu, Taylor and others only
>>>> works if you want to use the index to help with server-side filtering, and
>>>> it misses the boat on the larger and broader picture of improving query
>>>> optimization and joins.
>>>>
>>>> HINT: Unless I am mistaken... until you treat the index as orthogonal to the
>>>> base table, you will always lag the performance of traditional MPP DWs like
>>>> Informix XPS. (Now part of IBM's IM pillar.)
>>>>
>>>> In addition, until you fix coprocessors in general, you will have
>>>> scalability and performance issues.
>>>> (Note that you can write a coprocessor to create a sandbox and separate
>>>> the co-process from the RS JVM; however, it would be better if it were part
>>>> of the underlying coprocessor code.)
>>>>
>>>> The current implementation makes joins worthless.
>>>> (Note that in prior discussions, Phoenix doesn't do joins...)
>>>> Here's why:
>>>> In order to do a join, if you use the proposed index, you have to first
>>>> reduce each index into a single, sort-ordered set. Then you can take the
>>>> intersection of the index result sets. The final set would be in sort
>>>> order and a subset of the total rows. You can then fetch the rows and still
>>>> do a server-side filter before returning the ultimate result set.
>>>>
>>>> It's that first step of reducing each result set into a single
>>>> sort-ordered set that takes a lot of effort.
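
The join steps described above (reduce each index to one sorted set, then
intersect) can be sketched roughly as follows. This is an illustration only,
not Phoenix or HBase code: the function names and the per-region lists of row
keys are hypothetical, and it assumes row keys are unique within each index
result and that `None` is never a row key.

```python
import heapq


def reduce_index(region_results):
    """Reduce per-region row-key lists into one sort-ordered stream.

    Each region's keys are sorted first (the expensive step), then the
    sorted runs are merged lazily.
    """
    return heapq.merge(*(sorted(keys) for keys in region_results))


def intersect_sorted(left, right):
    """Yield the row keys present in both sorted iterables, in sort order."""
    left, right = iter(left), iter(right)
    a, b = next(left, None), next(right, None)
    while a is not None and b is not None:
        if a == b:
            yield a
            a, b = next(left, None), next(right, None)
        elif a < b:
            a = next(left, None)
        else:
            b = next(right, None)


# Hypothetical row keys returned by two index lookups, grouped by region.
index_a = [["r07", "r02"], ["r11", "r05"]]
index_b = [["r05"], ["r09", "r02", "r11"]]
joined = list(intersect_sorted(reduce_index(index_a), reduce_index(index_b)))
print(joined)
```

The resulting keys are in sort order, so the base-table rows can be fetched
in order and still passed through a server-side filter afterwards.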
>>>>
>>>>
>>>> On a side note... there's been some mention of ordering floats. Again, just
>>>> a word of caution... there isn't a really strong use case for indexing
>>>> numeric data types. Period. And to be very, very clear, there is a
>>>> distinction between numeric strings and numeric data types.
>>>>
>>>> -Mike
>>>>
>>>> PS. Because of my role as a consultant, I am very, very limited in what I
>>>> can say and contribute. I don't own my work product, my clients do. Take
>>>> what I say with a grain of salt. I'm just a skinny little boy from
>>>> Cleveland, Ohio, come to chase your beers and drink your women... ;-)
>>>>
>>>> On Jan 9, 2014, at 10:48 AM, James Taylor <jtaylor@salesforce.com> wrote:
>>>>
>>>>> IMHO, it would be valuable if the design considered both a global
>>>>> indexing solution and a local indexing solution. Both are useful in
>>>>> different circumstances. The global indexing design plus the
>>>>> application integration points could be derived from Jesse's work with
>>>>> his reference implementation in Phoenix - the global indexing code has
>>>>> no Phoenix dependencies and clearly defined integration points.
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> On Jan 9, 2014, at 6:36 AM, Jesse Yates <jesse.k.yates@gmail.com> wrote:
>>>>>
>>>>>> Yes, that was a big concern I had as well.
>>>>>>
>>>>>> It's not clear how that will work with a large number of indexes; if people
>>>>>> have one index, they will want more than one. To not plan for that seems
>>>>>> like an incomplete implementation to me. In a horizontally scalable system
>>>>>> like HBase, lots of buddy regions aren't going to work out well.* Once we
>>>>>> have regions that cannot be collocated, the extra RPC time starts to be the
>>>>>> biggest factor (as the doc points out) and we are back to what Phoenix is
>>>>>> already doing**.
>>>>>>
>>>>>> But I'm probably missing something here in what makes it different?
>>>>>>
>>>>>> For folks that haven't been following the issue, some high-level "how it all
>>>>>> kinda works" would be helpful from the championing committers; that's a long
>>>>>> doc to get through and grok :). How similar is this to the work currently
>>>>>> done by the existing indexing implementations (Huawei, Phoenix, NGDATA)? The
>>>>>> doc doesn't really nail down the interactions, but instead jumps right in
>>>>>> after describing why SI should be added.
>>>>>>
>>>>>> Agree this would be super useful, but don't want to waste too much work
>>>>>> reinventing the wheel or doing the wrong thing. Further, this impl quickly
>>>>>> starts to lead down the query optimization path, which gets HBase away from
>>>>>> its core "be a great byte store".
>>>>>>
>>>>>> Like I said, I'm all for secondary indexes in HBase and think this is a
>>>>>> great push. I don't mean to rain on any parades.
>>>>>>
>>>>>> - jesse
>>>>>>
>>>>>> * but a smart way to specify region collocation? That I can get behind, as
>>>>>> it would unify a couple of different indexing impls (e.g. Phoenix would
>>>>>> consider using it to help make indexing faster - RPCs do suck).
>>>>>>
>>>>>> ** for instance, the doc talks about how to implement indexing for
>>>>>> floats... That might be a default impl, but for use cases like Phoenix this
>>>>>> would break all our current encodings. We handled this in the indexing impl
>>>>>> by making the builder pluggable for different use cases to support
>>>>>> different encodings. I feel like a lot of the code for this kind of SI
>>>>>> impl is already in Phoenix and has been working and fast for several months
>>>>>> now; it's surprisingly tricky, especially with the delete cases and
>>>>>> timestamp manipulation issues.
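
For readers wondering what an order-preserving float encoding can look like,
here is one well-known bit-flipping scheme as a sketch. It is an assumption
for illustration only: it is not claimed to be the encoding in the design doc
or the one Phoenix uses, the function names are hypothetical, and NaN
handling is left out.

```python
import struct

SIGN_BIT = 1 << 63
ALL_BITS = 0xFFFFFFFFFFFFFFFF


def encode_double(value):
    """Encode a float as 8 bytes whose unsigned byte order matches numeric order."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    if bits & SIGN_BIT:
        bits ^= ALL_BITS   # negative: invert every bit so larger magnitudes sort first
    else:
        bits ^= SIGN_BIT   # non-negative: set the top bit so it sorts after negatives
    return struct.pack(">Q", bits)


def decode_double(data):
    """Invert encode_double."""
    bits = struct.unpack(">Q", data)[0]
    if bits & SIGN_BIT:
        bits ^= SIGN_BIT   # was non-negative
    else:
        bits ^= ALL_BITS   # was negative
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


values = [3.5, -2.0, 0.0, -1e10, 1e-3, 7.25]
by_bytes = sorted(values, key=encode_double)
print(by_bytes)
```

Sorting by the encoded bytes gives the same order as sorting the numbers,
which is the property a byte-ordered store needs from an index key. A
pluggable builder, as described above, would let each use case swap in its
own encoding instead.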
>>>>>>
>>>>>>
>>>>>> On Thursday, January 9, 2014, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
>>>>>> wrote:
>>>>>>
>>>>>>> Could you explain how the 1-1 association between user and index table
>>>>>>> regions is maintained? I wasn't able to understand fully from the
>>>>>>> document.
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: Ted Yu <dev@hbase.apache.org>
>>>>>>> To: dev@hbase.apache.org
>>>>>>> At: Jan 8, 2014 3:41:40 PM
>>>>>>>
>>>>>>> Hi,
>>>>>>> Secondary index support is a frequently requested feature.
>>>>>>>
>>>>>>> Please find the updated design doc here:
>>>>>>>
>>>>>>> https://issues.apache.org/jira/secure/attachment/12621909/SecondaryIndex%20Design_Updated_2.pdf
>>>>>>>
>>>>>>> HBASE-9203 is the umbrella JIRA.
>>>>>>>
>>>>>>> Implementation patch was attached to HBASE-10222
>>>>>>>
>>>>>>> Thanks to Rajesh who works on this feature.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -------------------
>>>>>> Jesse Yates
>>>>>> @jesse_yates
>>>>>> jyates.github.com
>>>>>
>>>>
>>>>
>>
>>
>
> Confidentiality Notice:  The information contained in this message,
> including any attachments hereto, may be confidential and is intended to
> be read only by the individual or entity to whom this message is
> addressed. If the reader of this message is not the intended recipient or
> an agent or designee of the intended recipient, please note that any
> review, use, disclosure or distribution of this message or its
> attachments, in any form, is strictly prohibited.  If you have received
> this message in error, please immediately notify the sender and/or
> Notifications@carrieriq.com and delete or destroy any copy of this message
> and its attachments.

The opinions expressed here are mine; while they may reflect a cognitive
thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com