hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrew Purtell <apurt...@apache.org>
Subject Re: Add client complexity or use a coprocessor?
Date Fri, 13 Apr 2012 22:32:55 GMT
I would look first at how concurrent is your coprocessor in operation. There's been quite a
bit of effort to make upserts (increments), and the MemStore in general, efficient at high
concurrency.


> Is the table auto-flush option the same as manually batching all the updates?

I think the answer to your question is yes. Setting HTable.setAutoFlush(false) will buffer
Puts (only) until the write buffer is full or a call to HTable.flushCommits(). So that would
be like manually batching a bunch of Puts. 

Best regards,


    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)



----- Original Message -----
> From: Tom Brown <tombrown52@gmail.com>
> To: user@hbase.apache.org; Andrew Purtell <apurtell@apache.org>
> Cc: 
> Sent: Thursday, April 12, 2012 1:37 AM
> Subject: Re: Add client complexity or use a coprocessor?
> 
> Andy,
> 
> Is the table auto-flush option the same as manually batching all the updates?
> 
> --Tom
> 
> On Tue, Apr 10, 2012 at 5:53 PM, Andrew Purtell <apurtell@apache.org> 
> wrote:
>>>  Even my implementation of an atomic increment
>>>  (using a coprocessor) is two orders of magnitude slower than the
>>>  provided implementation.  Are there properties inherent to
>>>  coprocessors or Incrementors that would force this kind of performance
>>>  difference?
>> 
>> 
>>  No.
>> 
>> 
>>  You may be seeing a performance difference if you are packing multiple 
> Increments into one round trip but not doing a similar kind of batching if 
> calling a custom endpoint. Each Endpoint invocation is a round trip unless you 
> do something like:
>> 
>>      List<Row> actions = new ArrayList<Row>();    
> actions.add(new Exec(conf, row, protocol, method, ...));
>> 
>>      actions.add(new Exec(conf, row, protocol, method, ...));
>> 
>>      actions.add(new Exec(conf, row, protocol, method, ...));
>> 
>>      Object[] results = table.batch(actions);
>>      ...
>> 
>> 
>>  I've not personally tried that particular API combination but don't 
> see why it would not be possible.
>> 
>> 
>>  Beyond that, I'd suggest running a regionserver with your coprocessor 
> installed under a profiler to see if you have monitor contention or a hotspot or 
> similar. It could be something unexpected.
>> 
>> 
>>>  Can you think of an efficient way to implement an atomic bitfield
>>>  (other than adding it as a separate feature like atomic increments)?
>> 
>>  I think the idea of an atomic bitfield operation as part of the core API is 
> intriguing. It has applicability to your estimator use case and I can think of a 
> couple of things I could use it for. If there is more support for this idea, 
> this may be something to consider.
>> 
>> 
>>  Best regards,
>> 
>> 
>>      - Andy
>> 
>>  Problems worthy of attack prove their worth by hitting back. - Piet Hein 
> (via Tom White)
>> 
>> 
>> 
>>  ----- Original Message -----
>>>  From: Tom Brown <tombrown52@gmail.com>
>>>  To: user@hbase.apache.org; Andrew Purtell <apurtell@apache.org>
>>>  Cc:
>>>  Sent: Tuesday, April 10, 2012 3:53 PM
>>>  Subject: Re: Add client complexity or use a coprocessor?
>>> 
>>>  Andy,
>>> 
>>>  I have attempted to use coprocessors to achieve a passable performance
>>>  but have failed so far. Even my implementation of an atomic increment
>>>  (using a coprocessor) is two orders of magnitude slower than the
>>>  provided implementation.  Are there properties inherent to
>>>  coprocessors or Incrementors that would force this kind of performance
>>>  difference?
>>> 
>>>  Can you think of an efficient way to implement an atomic bitfield
>>>  (other than adding it as a separate feature like atomic increments)?
>>> 
>>>  Thanks!
>>> 
>>>  --Tom
>>> 
>>>  On Tue, Apr 10, 2012 at 12:01 PM, Andrew Purtell 
> <apurtell@apache.org>
>>>  wrote:
>>>>   Tom,
>>>>>   I am a big fan of the Increment class. Unfortunately, I'm 
> not doing
>>>>>   simple increments for the viewer count. I will be receiving 
> duplicate
>>>>>   messages from a particular client for a specific cube cell, 
> and
>>>  don't
>>>>>   want them to be counted twice
>>>> 
>>>>   Gotcha.
>>>> 
>>>>>   I created an RPC endpoint coprocessor to perform this function 
> but
>>>>>   performance suffered heavily under load (it appears that the 
> endpoint
>>>>>   performs all functions in serial).
>>>> 
>>>>   Did you serialize access to your data structure(s)?
>>>> 
>>>>>   When I tried implementing it as a region observer, I was 
> unsure of how
>>>>>   to correctly replace the provided "put" with my own. 
> When I
>>>  issued a
>>>>>   put from within "prePut", the server blocked the new 
> put
>>>  (waiting for
>>>>>   the "prePut" to finish). Should I be attempting to 
> modify the
>>>  WALEdit
>>>>>   object?
>>>> 
>>>>   You can add KVs to the WALEdit. Or, you can get a reference to the
>>>  Put's familyMap:
>>>> 
>>>>       Map<byte[], List<KeyValue>> familyMap = 
> put.getFamilyMap();
>>>> 
>>>>   and if you modify the map, you'll change what gets committed.
>>>> 
>>>>>   Is there a way to extend the functionality of 
> "Increment" to
>>>  provide
>>>>>   arbitrary bitwise operations on a the contents of a field?
>>>> 
>>>>   As a matter of design, this should be a new operation. It does 
> sound
>>>  interesting and useful, some sort of atomic bitfield.
>>>> 
>>>> 
>>>>   Best regards,
>>>> 
>>>>       - Andy
>>>> 
>>>>   Problems worthy of attack prove their worth by hitting back. - 
> Piet Hein
>>>  (via Tom White)
>>>> 
>>>> 
>>>> 
>>>>   ----- Original Message -----
>>>>>   From: Tom Brown <tombrown52@gmail.com>
>>>>>   To: user@hbase.apache.org
>>>>>   Cc:
>>>>>   Sent: Monday, April 9, 2012 10:14 PM
>>>>>   Subject: Re: Add client complexity or use a coprocessor?
>>>>> 
>>>>>   Andy,
>>>>> 
>>>>>   I am a big fan of the Increment class. Unfortunately, I'm 
> not doing
>>>>>   simple increments for the viewer count. I will be receiving 
> duplicate
>>>>>   messages from a particular client for a specific cube cell, 
> and
>>>  don't
>>>>>   want them to be counted twice (my stats don't have to be 
> 100%
>>>>>   accurate, but the expected rate of duplicates will be higher 
> than the
>>>>>   allowable error rate).
>>>>> 
>>>>>   I created an RPC endpoint coprocessor to perform this function 
> but
>>>>>   performance suffered heavily under load (it appears that the 
> endpoint
>>>>>   performs all functions in serial).
>>>>> 
>>>>>   When I tried implementing it as a region observer, I was 
> unsure of how
>>>>>   to correctly replace the provided "put" with my own. 
> When I
>>>  issued a
>>>>>   put from within "prePut", the server blocked the new 
> put
>>>  (waiting for
>>>>>   the "prePut" to finish). Should I be attempting to 
> modify the
>>>  WALEdit
>>>>>   object?
>>>>> 
>>>>>   Is there a way to extend the functionality of 
> "Increment" to
>>>  provide
>>>>>   arbitrary bitwise operations on a the contents of a field?
>>>>> 
>>>>>   Thanks again!
>>>>> 
>>>>>   --Tom
>>>>> 
>>>>>>   If it helps, yes this is possible:
>>>>>> 
>>>>>>>    Can I observe updates to a
>>>>>>>    particular table and replace the provided data with 
> my own?
>>>  (The
>>>>>>>    client calls "put" with the actual user ID, 
> my
>>>  co-processor
>>>>>   replaces
>>>>>>>    it with a computed value, so the actual user ID never 
> gets
>>>  stored in
>>>>>>>    HBase).
>>>>>> 
>>>>>>   Since your option #2 requires atomic updates to the data 
> structure,
>>>  have you
>>>>>   considered native
>>>>>>   atomic increments? See
>>>>>> 
>>>>>> 
>>> 
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#incrementColumnValue%28byte[],%20byte[],%20byte[],%20long,%20boolean%29
>>>>>> 
>>>>>> 
>>>>>>   or
>>>>>> 
>>>>>> 
>>> 
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Increment.html
>>>>>> 
>>>>>>   The former is a round trip for each value update. The 
> latter allows
>>>  you to
>>>>>   pack multiple updates
>>>>>>   into a single round trip. This would give you accurate 
> counts even
>>>  with
>>>>>   concurrent writers.
>>>>>> 
>>>>>>   It should be possible for you to do partial aggregation on 
> the
>>>  client side
>>>>>   too whenever parallel
>>>>>>   requests colocate multiple updates to the same cube within 
> some
>>>  small window
>>>>>   of time.
>>>>>> 
>>>>>>   Best regards,
>>>>>> 
>>>>>> 
>>>>>>       - Andy
>>>>>> 
>>>>>>   Problems worthy of attack prove their worth by hitting 
> back. - Piet
>>>  Hein
>>>>>   (via Tom White)
>>>>>> 
>>>>>>   ----- Original Message -----
>>>>>>>    From: Tom Brown <tombrown52@gmail.com>
>>>>>>>    To: user@hbase.apache.org
>>>>>>>    Cc:
>>>>>>>    Sent: Monday, April 9, 2012 9:48 AM
>>>>>>>    Subject: Add client complexity or use a coprocessor?
>>>>>>> 
>>>>>>>    To whom it may concern,
>>>>>>> 
>>>>>>>    Ignoring the complexities of gathering the data, 
> assume that I
>>>  will be
>>>>>>>    tracking millions of unique viewers. Updates from 
> each of our
>>>  millions
>>>>>>>    of clients are gathered in a centralized platform and 
> spread
>>>  among a
>>>>>>>    group of machines for processing and inserting into 
> HBase
>>>  (assume that
>>>>>>>    this group can be scaled horizontally). The data is 
> stored in
>>>  an OLAP
>>>>>>>    cube format and one of the metrics I'm tracking 
> across
>>>  various
>>>>>>>    attributes is viewership (how many people from Y are 
> watching
>>>  X).
>>>>>>> 
>>>>>>>    I'm writing this to ask for your thoughts as to 
> the most
>>>>>   appropriate
>>>>>>>    way to structure my data so I can count unique TV 
> viewers
>>>  (assume a
>>>>>>>    service like netflix or hulu).
>>>>>>> 
>>>>>>>    Here are the solutions I'm considering:
>>>>>>> 
>>>>>>>    1. Store each unique user ID as the cell name within 
> the
>>>  cube(s) it
>>>>>>>    occurs. This has the advantage of having 100% 
> accuracy, but
>>>  the
>>>>>>>    downside is the enormous space required to store each 
> unique
>>>  cell.
>>>>>>>    Consuming this data is also problematic as the only 
> way to
>>>  provide a
>>>>>>>    viewership count is by counting each cell. To save 
> the
>>>  overhead of
>>>>>>>    sending each cell over the network, counting them 
> could be
>>>  done by a
>>>>>>>    coprocessor on the region server, but that still 
> doesn't
>>>  avoid the
>>>>>>>    overhead of reading each cell from the disk. I'm 
> also not
>>>  sure what
>>>>>>>    happens if a single row is larger than an entire 
> region (48
>>>  bytes per
>>>>>>>    user ID * 10,000,000 users = 480GB).
>>>>>>> 
>>>>>>>    2. Store a byte array that allows estimating unique 
> viewers
>>>  (with a
>>>>>>>    small margin of error*). Add a co-processor for 
> updating this
>>>  column
>>>>>>>    so I can guarantee the updates to a specific OLAP 
> cell will be
>>>  atomic.
>>>>>>>    The main benefit from this path is that there the 
> nodes that
>>>  update
>>>>>>>    HBase can be less complex. Another benefit I see is 
> that the I
>>>  can
>>>>>>>    just add more HBase regions as scale requires. 
> However,
>>>  I'm not
>>>>>   sure
>>>>>>>    if I can use a coprocessor the way I want; Can I 
> observe
>>>  updates to a
>>>>>>>    particular table and replace the provided data with 
> my own?
>>>  (The
>>>>>>>    client calls "put" with the actual user ID, 
> my
>>>  co-processor
>>>>>   replaces
>>>>>>>    it with a computed value, so the actual user ID never 
> gets
>>>  stored in
>>>>>>>    HBase).
>>>>>>> 
>>>>>>>    3. Store a byte array that allows estimating unique 
> viewers
>>>  (with a
>>>>>>>    small margin of error*). Re-arrange my architecture 
> so that
>>>  each OLAP
>>>>>>>    cell is only updated by a single node. The main 
> benefit from
>>>  this
>>>>>>>    would be that I don't need to worry about atomic
>>>  operations in
>>>>>   HBase
>>>>>>>    since all updates for a single cell will be atomic 
> and in
>>>  serial. The
>>>>>>>    biggest downside is that I believe it will add 
> significant
>>>  complexity
>>>>>>>    to my overall architecture.
>>>>>>> 
>>>>>>> 
>>>>>>>    Thanks for your time, and I look forward to hearing 
> your
>>>  thoughts.
>>>>>>> 
>>>>>>>    Sincerely,
>>>>>>>    Tom Brown
>>>>>>> 
>>>>>>>    *(For information about the byte array mentioned in 
> #2 and #3,
>>>  see:
>>>>>>> 
>>>>> 
>>> 
> http://highscalability.com/blog/2012/4/5/big-data-counting-how-to-count-a-billion-distinct-objects-us.html)
>>>>>>> 
>>>>> 
>>> 
> 

Mime
View raw message