hbase-user mailing list archives

From anil gupta <anilgupt...@gmail.com>
Subject Re: Coprocessor Increments
Date Tue, 15 Oct 2013 03:57:36 GMT
Hi Ted,

Sure, I would like to revive it. My bad that I didn't wrap up the patch. I
am also in the middle of making this coprocessor handle the "nulls first" and
"nulls last" clauses. I am aiming to finish that in a month or so. Thanks
for reminding me.

~Anil
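
For readers unfamiliar with the "nulls first" / "nulls last" clauses mentioned above, here is a minimal sketch of the ordering semantics, assuming they mean the same thing as in SQL's ORDER BY. It uses only the standard java.util.Comparator helpers; the class and method names are hypothetical, and this is not code from the HBASE-7474 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not taken from the HBASE-7474 patch.
class NullOrderingSketch {

    // Returns a sorted copy of the values in ascending order, with null
    // (missing) values placed either before or after all non-null values.
    static List<String> sortValues(List<String> values, boolean nullsFirst) {
        Comparator<String> natural = Comparator.naturalOrder();
        Comparator<String> cmp = nullsFirst
                ? Comparator.nullsFirst(natural)
                : Comparator.nullsLast(natural);
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(cmp);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("b", null, "a");
        System.out.println(sortValues(values, true));   // prints [null, a, b]
        System.out.println(sortValues(values, false));  // prints [a, b, null]
    }
}
```

In a sorting coprocessor, the null case would presumably correspond to rows that lack the sort column, but that mapping is an assumption here.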



On Mon, Oct 14, 2013 at 3:34 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Anil:
> bq. We also use CP's wherever they are appropriate(like HBASE-7474).
>
> HBASE-7474 has been dormant for several months. Do you want to revive it?
>
> Cheers
>
>
> On Mon, Oct 14, 2013 at 3:25 PM, anil gupta <anilgupta84@gmail.com> wrote:
>
> > Inline.
> >
> >
> > On Mon, Oct 14, 2013 at 7:50 AM, Michael Segel <msegel_hadoop@hotmail.com> wrote:
> >
> > > Anil,
> > >
> > > I wasn't suggesting that you can't do what you're doing, but you end up
> > > running into the risks which coprocessors are supposed to remove. The
> > > standard YMMV always applies.
> > >
> > Agree with you. But, as per my knowledge and experience with coprocessors,
> > they are meant to be used for operations that are local to the RS. Otherwise,
> > you are in danger of running into deadlocks and scalability issues.
> >
> > >
> > > You have a cluster… another team in your company wants to use the cluster.
> > > So instead of the cluster being a single resource for your app/team, it now
> > > becomes a shared resource. So now you have people accessing HBase for
> > > multiple apps.
> > >
> > Well, it's a separation of responsibility in this case. We don't want teams
> > to step on each other's toes, and at the same time we want them to work well
> > as an ecosystem.
> > Rule: other teams can use the same cluster, but they cannot write directly into
> > the tables that we own/control. If they want to write into our tables, then
> > they have to use our HBase client.
> >
> > >
> > > You could then run multiple HBase HMasters with different locations for
> > > files, however… this can get messy.
> > > HOYA seems to suggest this as the future. If so, then you have to wonder
> > > about data locality.
> > >
> > HOYA is not even in beta at present. So, right now we are not thinking
> > about it.
> >
> > >
> > > Having your app update the primary table and then the secondary index is
> > > always a good fallback, however you need to ensure that you understand the
> > > risks.
> > >
> > Agree, I understand that there is risk. But you have to bite the bullet
> > when you are doing something that is not supported out of the box. We also
> > use CP's wherever they are appropriate (like HBASE-7474).
> >
> > >
> > > With respect to secondary indexes… if you decouple the writes… you can get
> > > better throughput. Note that the code becomes a bit more complex because
> > > you're going to have to introduce a couple of different things. But that's
> > > something for a different discussion…
> > >
> > Whether to use a CP or not depends on the use case. In my opinion, CP's are
> > really powerful and an awesome feature in HBase. But sometimes, if not used
> > properly (like creating a cyclic graph, as per Tom's example), they might be
> > problematic.
> >
> >
> > >
> > > On Oct 13, 2013, at 10:15 AM, anil gupta <anilgupta84@gmail.com> wrote:
> > >
> > > > Inline.
> > > >
> > > > On Sun, Oct 13, 2013 at 6:02 AM, Michael Segel <msegel_hadoop@hotmail.com> wrote:
> > > >
> > > >> Ok…
> > > >>
> > > >> Sure you can have your app update the secondary index table.
> > > > >> The only issue with that is if someone updates the base table outside of
> > > > >> your app, they may or may not increment the secondary index.
> > > >>
> > > > Anil: We don't allow people to write data into HBase from their own HBase
> > > > client. We control the writes into HBase, so we don't have the problem of
> > > > the secondary index not getting written.
> > > > For example, if you expose a RESTful web service, you can easily control the
> > > > writes to HBase. Even if a user requests to write one row into the "main
> > > > table", your application can have the logic to write into the "secondary
> > > > index" tables. In this way, it is transparent to users as well. You can
> > > > add/remove secondary indexes as you want.
> > > >
> > > > >> Note that your secondary index doesn't have to be an inverted table, but
> > > > >> could be SOLR, LUCENE or something else.
> > > > >>
> > > > > Anil: As of now, we are happy with inverted tables as they fit our use
> > > > > case.
> > > >
> > > >>
> > > > >> So you really want secondary indexes on the server.
> > > > >>
> > > > >> There are a couple of things that could improve the performance, although
> > > > >> the write to the secondary index would most likely lag under heavy load.
> > > >>
> > > >>
> > > > >> On Oct 12, 2013, at 11:27 PM, anil gupta <anilgupta84@gmail.com> wrote:
> > > >>
> > > >>> John,
> > > >>>
> > > >>> My 2 cents:
> > > > >>> I tried implementing a secondary index by using Region Observers on Put. It
> > > > >>> works well under low load. But under heavy load, the RO could not keep up
> > > > >>> with the load of cross-region-server writes.
> > > > >>> Then I decided not to use the RO, as per Andrew's explanation, and I moved
> > > > >>> all the logic of building the secondary index tables into my HBase client.
> > > > >>> Since then, the system has been running fine under heavy load.
> > > > >>> IMO, if you use an RO and do cross-RS reads/writes, then perhaps this will
> > > > >>> become your bottleneck in HBase.
> > > > >>> Is it possible for you to avoid the RO and control the writes/updates from
> > > > >>> the client side?
> > > >>>
> > > >>> Thanks,
> > > >>> Anil Gupta
> > > >>>
> > > >>>
> > > >>> On Fri, Oct 11, 2013 at 6:06 PM, John Weatherford <
> > > >>> john.weatherford@telescope.tv> wrote:
> > > >>>
> > > >>>> OP Here :)
> > > >>>>
> > > > >>>> Our current design involves a Region Observer on a table that does
> > > > >>>> increments on a second table. We took the approach that Michael said and
> > > > >>>> inside the RO, we got a new connection and everything. We believe this is
> > > > >>>> causing deadlocks for us. Our next attempt is going to be writing to
> > > > >>>> another row in the same table where we will store the increments. If this
> > > > >>>> doesn't work, we are going to simply pull the increments out of the RO and
> > > > >>>> do them in the application or in Flume.
> > > >>>>
> > > >>>> @Tom Brown
> > > > >>>> I would be very interested to hear more about your solution of
> > > > >>>> aggregating the increments in another system that is then responsible for
> > > > >>>> updating in HBase.
> > > >>>>
> > > >>>> -jW
> > > >>>>
> > > >>>>
> > > >>>> On Fri 11 Oct 2013 10:26:58 AM PDT, Vladimir Rodionov wrote:
> > > >>>>
> > > > >>>>>>> With respect to the OP's design… does the deadlock occur because he's
> > > > >>>>>>> trying to update a column in a different row within the same table?
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>> Because he is trying to update a *row* in a different Region (and
> > > > >>>>> potentially in a different RS).
> > > >>>>>
> > > >>>>> Best regards,
> > > >>>>> Vladimir Rodionov
> > > >>>>> Principal Platform Engineer
> > > >>>>> Carrier IQ, www.carrieriq.com
> > > >>>>> e-mail: vrodionov@carrieriq.com
> > > >>>>>
> > > > >>>>> ________________________________________
> > > >>>>> From: Michael Segel [msegel_hadoop@hotmail.com]
> > > >>>>> Sent: Friday, October 11, 2013 9:10 AM
> > > >>>>> To: user@hbase.apache.org
> > > >>>>> Cc: Vladimir Rodionov
> > > >>>>> Subject: Re: Coprocessor Increments
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Thanks & Regards,
> > > >>> Anil Gupta
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > Thanks & Regards,
> > > > Anil Gupta
> > >
> > >
> >
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
> >
>



-- 
Thanks & Regards,
Anil Gupta
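
The client-side approach discussed in this thread (the application writes the main table and the inverted index table itself, instead of a Region Observer doing cross-region-server writes) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual client from the thread: the two maps are in-memory stand-ins for HBase tables rather than the HBase client API, and the class name, method names, and the "value|rowKey" index key layout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: in-memory stand-ins for a main table and an
// inverted (secondary index) table. This is NOT the HBase client API.
class ClientSideIndexSketch {
    // Main table: row key -> indexed column value.
    private final Map<String, String> mainTable = new HashMap<>();
    // Index table: "<value>|<rowKey>" -> rowKey, kept sorted like HBase row keys.
    // Assumes '|' never appears inside a value.
    private final TreeMap<String, String> indexTable = new TreeMap<>();

    // All writes go through this one method, so the index is always maintained.
    // Note: in a real cluster the two writes are not atomic; a failure between
    // them leaves the index stale, which is the risk raised in the thread.
    void put(String rowKey, String value) {
        String old = mainTable.put(rowKey, value);
        if (old != null) {
            indexTable.remove(old + "|" + rowKey); // drop the stale index entry
        }
        indexTable.put(value + "|" + rowKey, rowKey);
    }

    // Prefix scan over the index: every key that starts with "<value>|".
    List<String> findRowsByValue(String value) {
        String start = value + "|";
        String stop = value + "}"; // '}' is the character right after '|'
        return new ArrayList<>(indexTable.subMap(start, stop).values());
    }

    public static void main(String[] args) {
        ClientSideIndexSketch client = new ClientSideIndexSketch();
        client.put("row1", "blue");
        client.put("row2", "blue");
        client.put("row3", "red");
        System.out.println(client.findRowsByValue("blue")); // prints [row1, row2]
    }
}
```

Because every write passes through put(), callers never need to know the index exists, which matches the point about exposing a single service or client to other teams. The order of the two writes and any retry or repair logic are design choices this sketch leaves out.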
