hbase-user mailing list archives

From Anoop John <anoop.hb...@gmail.com>
Subject Re: How can I achieve HBase row level atomicity?
Date Mon, 07 Nov 2016 18:22:46 GMT
So what is the update op doing?  Say the current value in the table
is 'x' and the incoming data from ETL is 'y'; what is the updated value you
try to put?  Is it an increment by 'y'? Or is there some complex logic behind it?

-Anoop-
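
If the update really is a plain increment by 'y', the get-then-put race can be
avoided by letting the store apply the delta atomically; in HBase the client
offers Table.incrementColumnValue for that. The sketch below is not HBase
client code: it is a self-contained, in-memory Java stand-in (a
ConcurrentHashMap plays the table, and the class and method names are made up
for illustration) showing that concurrent atomic increments from several
workers lose no updates.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AtomicIncrementSketch {
    // Hypothetical stand-in for a server-side increment: adds delta to the
    // stored value in one atomic step (NOT the HBase client API).
    static long increment(ConcurrentHashMap<String, Long> table, String rowKey, long delta) {
        return table.merge(rowKey, delta, Long::sum);
    }

    // Runs `jobs` workers that each increment the same rowkey `updatesPerJob`
    // times, then returns the final stored value.
    static long runConcurrentIncrements(String rowKey, int jobs, int updatesPerJob) {
        ConcurrentHashMap<String, Long> table = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        for (int j = 0; j < jobs; j++) {
            pool.submit(() -> {
                for (int i = 0; i < updatesPerJob; i++) {
                    increment(table, rowKey, 1L);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table.get(rowKey);
    }

    public static void main(String[] args) {
        // 7 workers, mirroring the 7 ETL jobs mentioned in the thread.
        System.out.println(runConcurrentIncrements("row-1", 7, 1000)); // prints 7000
    }
}
```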

On Mon, Nov 7, 2016 at 11:09 PM, Manjeet Singh
<manjeet.chandhok@gmail.com> wrote:
> Yes Anoop, you are right.
> My input source is a Kafka pipeline. We have 7 Spark ETL jobs which are
> responsible for aggregation: they get the last value from HBase and put the
> updated one.
> We set zero version so we always have only one copy in HBase.
> The problem is when 2 different ETL jobs have the same rowkey: one ETL
> job gets the value, and at the same time the second also gets that value and
> updates it. Now the first ETL job will replace that updated value.
>
> It can happen within the same ETL job too.
>
> Thanks
> Manjeet
>
> On Mon, Nov 7, 2016 at 7:07 PM, Anoop John <anoop.hbase@gmail.com> wrote:
>
>> Seems you want to get an already present row, do some op, and put the
>> updated value.  What is that op?  If you can explain it, we can try to
>> help you with ways (if available).  As such, the above kind of API does not
>> guarantee you any atomicity.
>>
>> -Anoop-
>>
>> On Fri, Nov 4, 2016 at 4:12 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > bq. from some api its going for update (means get is performed)
>> >
>> > Update on hbase would correlate with Put or Delete (not sure what 'get'
>> > above means).
>> >
>> > Looks like your concern is that two concurrent updates may overwrite the
>> > data for same rowkey.
>> >
>> > Have you considered using:
>> >
>> >   public boolean checkAndPut(final byte [] row,
>> >       final byte [] family, final byte [] qualifier, final byte [] value,
>> >       final Put put)
>> >
>> > Cheers
>> >
>> > On Thu, Nov 3, 2016 at 10:57 AM, Manjeet Singh <manjeet.chandhok@gmail.com> wrote:
>> >
>> >> Hi Ted,
>> >>
>> >> Code is not required for this case, and how can MVCC help?
>> >>
>> >> The question is: if I have a record which comes into HBase and from some api its
>> >> going for update (means get is performed),
>> >> and another operation has also performed a get, then both update the same rowkey
>> >> and at the end we will not have the correct value.
>> >>
>> >> In a separate mail thread I asked the same type of question regarding rowkey locks,
>> >> but I still didn't get a correct answer.
>> >>
>> >> Thanks
>> >> Manjeet
>> >>
>> >> On Wed, Nov 2, 2016 at 11:36 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >>
>> >> > Were you including code in the image (which didn't come through) ?
>> >> >
>> >> > MultiVersionConcurrencyControl is involved in answering your question.
>> >> >
>> >> > See http://hbase.apache.org/book.html#hregion.scans
>> >> >
>> >> > Cheers
>> >> >
>> >> > On Wed, Nov 2, 2016 at 10:57 AM, Manjeet Singh <manjeet.chandhok@gmail.com> wrote:
>> >> >
>> >> > > Hi All
>> >> > >
>> >> > > I have an ETL process for inserting data into HBase; for this I have Spark
>> >> > > jobs which are responsible for reading data from Kafka topics. So my
>> >> > > question is: if some rowkey already exists in HBase and I have 3 Spark
>> >> > > jobs running and they all try to update the same rowkey, how does HBase
>> >> > > deal with atomicity?
>> >> > >
>> >> > > For more understanding: I have 3 rowkeys coming from 3 separate Spark jobs,
>> >> > > and all are trying to update the same rowkey, which already exists in the
>> >> > > HBase table.
>> >> > >
>> >> > > [image: Inline image 1]
>> >> > >
>> >> > > --
>> >> > > luv all
>> >> > >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> luv all
>> >>
>>
>
>
>
> --
> luv all
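
For update logic more complex than an increment, the checkAndPut call quoted
in this thread supports an optimistic read-modify-write loop: get the current
value, compute the new one, and put it only if the cell still holds the value
that was read, retrying otherwise. The following is a minimal sketch under
stated assumptions, not HBase client code: CheckAndPutSketch and its methods
are hypothetical, a ConcurrentHashMap stands in for the table, and only the
compare-then-write behavior is meant to mirror checkAndPut. Treating a null
expected value as "the cell must not exist yet" is likewise a modeling choice
here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.LongUnaryOperator;

class CheckAndPutSketch {
    // In-memory stand-in for one column per rowkey (NOT the HBase client API).
    private final ConcurrentHashMap<String, Long> cells = new ConcurrentHashMap<>();

    Long get(String rowKey) {
        return cells.get(rowKey);
    }

    // Writes newValue only if the stored value still equals `expected`;
    // a null `expected` means the cell must be absent. Returns true on success.
    boolean checkAndPut(String rowKey, Long expected, long newValue) {
        if (expected == null) {
            return cells.putIfAbsent(rowKey, newValue) == null;
        }
        return cells.replace(rowKey, expected, newValue);
    }

    // Optimistic read-modify-write: retry until no other writer intervened
    // between our get and our conditional put. A missing cell is read as 0.
    long update(String rowKey, LongUnaryOperator op) {
        while (true) {
            Long current = get(rowKey);
            long next = op.applyAsLong(current == null ? 0L : current);
            if (checkAndPut(rowKey, current, next)) {
                return next;
            }
        }
    }

    // Runs `jobs` workers that each apply `updatesPerJob` updates to one rowkey.
    static long runConcurrentUpdates(int jobs, int updatesPerJob) {
        CheckAndPutSketch table = new CheckAndPutSketch();
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        for (int j = 0; j < jobs; j++) {
            pool.submit(() -> {
                for (int i = 0; i < updatesPerJob; i++) {
                    table.update("row-1", v -> v + 1);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table.get("row-1");
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentUpdates(7, 1000)); // prints 7000
    }
}
```

The retry loop trades locking for repeated attempts, so it suits rows with
modest contention; under heavy contention on one rowkey the number of retries
grows.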
