hbase-user mailing list archives

From John Weatherford <john.weatherf...@telescope.tv>
Subject Coprocessor Increments
Date Thu, 10 Oct 2013 01:43:21 GMT
Hi All,
   We have been running into an RPC deadlock issue on HBase. From our
investigation, we believe the root cause is that we are doing
cross-region increments from a coprocessor. After some further searching
and reading over this earlier list post
<http://mail-archives.apache.org/mod_mbox/hbase-user/201212.mbox/%3CCA+RK=_BP8k1Z-gQ+38RiipKgzi+=5Cn3EkZDJZ_Z-2QT8xOZ+Q@mail.gmail.com%3E>
we think that we can solve this by doing the increments locally on the
region. My question is: what happens if the specified row does not
land in the current region? We can obviously do our best to make sure
that it does, but is there any way to be absolutely sure that it does?
This assumes we use incrementColumnValue() from the HRegion class
<http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/regionserver/HRegion.html#incrementColumnValue%28byte[],%20byte[],%20byte[],%20long,%20boolean%29>.

Here is the method signature for reference:

public long incrementColumnValue(byte[] row,
                                  byte[] family,
                                  byte[] qualifier,
                                  long amount,
                                  boolean writeToWAL)

If we specify a row, there seems to be no guarantee that the row falls
within the region the coprocessor is running on.
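
To make the concern concrete, here is a minimal standalone sketch of the
kind of range check we would want before incrementing locally. It assumes
a region covers the half-open key range [startKey, endKey), with an empty
key meaning "unbounded", and that row keys compare as unsigned bytes. The
class and helper names (RegionRangeCheck, compareKeys, containsRow, b) and
the boundary keys are made up for illustration. In a real coprocessor I
believe HRegionInfo.containsRow(byte[]) is the call to look at, but we
have not verified how it interacts with incrementColumnValue().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a region boundary check; not HBase code.
final class RegionRangeCheck {

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // True if row lies in [startKey, endKey); an empty key means unbounded.
    static boolean containsRow(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart =
            startKey.length == 0 || compareKeys(row, startKey) >= 0;
        boolean beforeEnd =
            endKey.length == 0 || compareKeys(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    // Helper: encode an example row key as bytes.
    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed boundaries for an example region.
        byte[] start = b("California-0");
        byte[] end = b("California-5");
        System.out.println(containsRow(start, end, b("California-12345"))); // true
        System.out.println(containsRow(start, end, b("California-total"))); // false
    }
}
```

With these assumed boundaries the inserted row is inside the region but
the row we want to increment is not, which is exactly the case we are
worried about.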

If the call does force the increment to be on the same region, what will
happen if a later call ends up on another region but with the same
rowkey?

Contrived Example

Inserting rowkey "California-12345" triggers a coprocessor that calls
incrementColumnValue() with a rowkey of "California-total", all on
Region 1.

Initially, the increment would likely land on the same region as the
insert. But as the table grows, the row targeted by the increment could
end up on another region. If the increment is confined to the local
region, then suppose we later insert "California-95424", which again
triggers a call to incrementColumnValue() with a rowkey of
"California-total", this time all on Region 2.

Are we now left with two rows keyed "California-total", one on each
region? If so, what happens if these two regions are later merged
into one?
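
The example above can be sketched as a toy model of how a sorted table
maps rowkeys to regions. This is not HBase code: RegionLookup, addRegion
and regionFor are hypothetical names, the split point "California-5" is
invented, and rowkeys are modelled as ASCII strings so that String
ordering matches byte ordering. It only illustrates that, by key range,
a given rowkey belongs to one region at a time, and that a split can move
"California-total" away from the rows that trigger the increment.

```java
import java.util.TreeMap;

// Toy model: each region is identified by its start key.
final class RegionLookup {
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String regionName) {
        regionsByStartKey.put(startKey, regionName);
    }

    // The owning region is the one with the greatest start key <= rowKey.
    // Assumes a region with the empty start key has been added.
    String regionFor(String rowKey) {
        return regionsByStartKey.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        RegionLookup table = new RegionLookup();
        table.addRegion("", "Region 1"); // a single region covers everything

        System.out.println(table.regionFor("California-12345")); // Region 1
        System.out.println(table.regionFor("California-total")); // Region 1

        // Assumed split point after the table grows.
        table.addRegion("California-5", "Region 2");

        System.out.println(table.regionFor("California-12345")); // Region 1
        System.out.println(table.regionFor("California-95424")); // Region 2
        System.out.println(table.regionFor("California-total")); // Region 2
    }
}
```

In this model, an insert of "California-12345" after the split still
runs on Region 1, while the row it wants to increment now belongs to
Region 2.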

Hopefully this all makes sense. We are on HBase 0.94.10. If we are going
about this all wrong, that could be the issue as well :)

Thanks.

  -John Weatherford
