hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: RS unresponsive after series of deletes
Date Thu, 21 Jun 2012 04:33:26 GMT
Ted T:
Do your hundreds of thousands of cell deletes overlap (in terms of column
family) across rows?

In HRegionServer:
  public <R> MultiResponse multi(MultiAction<R> multi) throws IOException {
...
      for (Action<R> a : actionsForRegion) {
        action = a.getAction();
...
          if (action instanceof Delete) {
            delete(regionName, (Delete) action);

I think if we group the Deletes in actionsForRegion, we can utilize the
following batch method:
  public int delete(final byte[] regionName, final List<Delete> deletes)

Inside HRegion, we should be able to reduce the number of times
HRegion$RegionScannerImpl.next() is called.
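
Here is a rough sketch of the grouping (illustrative only, not a patch; it
reuses actionsForRegion and regionName from the snippets above, and the
other names plus error handling are made up):

  List<Delete> groupedDeletes = new ArrayList<Delete>();
  for (Action<R> a : actionsForRegion) {
    Row action = a.getAction();
    if (action instanceof Delete) {
      groupedDeletes.add((Delete) action);  // defer instead of applying now
    } else {
      // dispatch Puts, Gets, etc. as before
    }
  }
  if (!groupedDeletes.isEmpty()) {
    delete(regionName, groupedDeletes);     // the List<Delete> overload above
  }

This would give HRegion a chance to prepare the delete timestamps for the
whole group at once instead of once per action.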

Cheers

On Wed, Jun 20, 2012 at 8:52 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Looking at the stack trace, I found the following hot spot:
>
>    1. org.apache.hadoop.hbase.regionserver.StoreFileScanner.realSeekDone(StoreFileScanner.java:340)
>    2. org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:331)
>    3. org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:105)
>    4. org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:406)
>    5. org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
>    6. org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3354)
>    7. org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3310)
>    8. org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3327)
>    9. org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4066)
>    10. org.apache.hadoop.hbase.regionserver.HRegion.prepareDeleteTimestamps(HRegion.java:1710)
>    11. org.apache.hadoop.hbase.regionserver.HRegion.internalDelete(HRegion.java:1753)
>
> From HRegion:
>
>       for (KeyValue kv: kvs) {
>         //  Check if time is LATEST, change to time of most recent addition if so
>         //  This is expensive.
>         if (kv.isLatestTimestamp() && kv.isDeleteType()) {
> ...
>           List<KeyValue> result = get(get, false);
>
> We perform a get() for each KeyValue whose timestamp is LATEST.
> This explains the unresponsiveness.
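>
> A client-side workaround worth trying (a sketch against the 0.9x client
> API; the row key, family, qualifier and timestamp names below are made
> up): pass an explicit timestamp to deleteColumn() so the server does not
> need that per-cell get():
>
>   Delete d = new Delete(rowKey);
>   for (byte[] qualifier : qualifiersToRemove) {
>     // explicit timestamp: the server skips the "find latest version" read
>     d.deleteColumn(FAMILY, qualifier, knownTimestamp);
>   }
>   table.delete(d);
>
> If all versions of a cell should go, deleteColumns() (plural) writes a
> column delete marker and, as far as I can tell, avoids the lookup as well.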
>
> FYI
>
>
> On Wed, Jun 20, 2012 at 5:07 PM, Ted Tuttle <ted.tuttle@mentacapital.com> wrote:
>
>> First off, J-D, thanks for helping me work through this.  You've
>> inspired some different angles and I think I've finally made it bleed in
>> a controlled way.
>>
>> > - That data you are deleting needs to be read when you scan, like I
>> > said earlier a delete is in fact an insert in HBase and this isn't
>> > cleared up until a major compaction happens.
>>
>> I manually compacted (via the UI) the table I deleted from.  The scan
>> times are still >10 min.  Reading through each node's log, I see
>> messages indicating the major compactions were going to be skipped.
>> Is it safe to say that hitting that 'Compact' button is just a
>> recommendation?  Is there an operation we can perform after a big delete
>> to guarantee that the deletes get compacted away?
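>>
>> For reference, here is what I could script instead of the UI button (a
>> sketch only; the table name is made up, and my understanding is that
>> majorCompact() merely queues an asynchronous request):
>>
>>   HBaseAdmin admin = new HBaseAdmin(conf);
>>   // request a major compaction of the table; this returns immediately,
>>   // it does not wait for the compaction to run
>>   admin.majorCompact("my_table");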
>>
>> > Do you have scanner caching turned on? Just to be sure set
>> > scan.setCaching(1) and see if it makes any difference.
>>
>> A bit confused here.  Under what conditions would you recommend setting
>> the scan caching to 1?  My read path doesn't know whether a lot of data
>> was recently deleted, so I can't disable caching conditionally.  I want
>> scan caching in general, I believe.
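>>
>> If it helps to be concrete, I read the suggestion as a diagnostic along
>> these lines (sketch only, not a production setting):
>>
>>   Scan scan = new Scan();
>>   scan.setCaching(1);   // fetch one row per RPC while diagnosing
>>   // we normally keep a larger value, e.g. scan.setCaching(100)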
>>
>> > Are you saying that you have Delete objects on which you did
>> > deleteColumn() 1000x? If so, look no further there's your problem.
>>
>> I am calling deleteColumn() thousands of times per Delete object.
>>
>> I can delete a row w/ 20k keys in ~2 sec.  If I issue 10 of these (they
>> appear to be fired off asynchronously by the client), the unresponsive RS
>> behavior ensues.  Here is a stack dump from an RS that is running at >90%
>> utilization as it processes my deletes:
>>
>> http://pastebin.com/8y5x4xU7
>>
>> Some logs around this time:
>>
>> http://pastebin.com/UpPMbsmn
>>
>> So, my takeaway is that the RSs don't like being slammed w/ hundreds of
>> thousands of cell deletes.  I can be more measured about these deletes
>> going forward (see the sketch below).  That the RSs don't handle this
>> more gracefully sounds like a bug; at a minimum, there appears to be a
>> nonlinear response.  What do you think?
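>>
>> By "more measured" I mean something like this sketch (the batch size,
>> the buildDeletes() helper and the table variable are all made up):
>>
>>   List<Delete> pending = buildDeletes();  // hypothetical helper
>>   int batchSize = 100;                    // illustrative chunk size
>>   for (int i = 0; i < pending.size(); i += batchSize) {
>>     int end = Math.min(i + batchSize, pending.size());
>>     // copy the chunk: HTable.delete(List) may modify the list passed in
>>     List<Delete> batch = new ArrayList<Delete>(pending.subList(i, end));
>>     table.delete(batch);  // blocks until this chunk is processed
>>   }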
