hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: RS unresponsive after series of deletes
Date Sat, 23 Jun 2012 02:35:12 GMT
Sorry for chiming in late.

Are you sure you want to use Delete.deleteColumn and not Delete.deleteColumns (note the plural
form)?
deleteColumn marks a single version of a column (of a CF of a row) for deletion: the latest
version, unless you specify a timestamp.
deleteColumns marks all versions of a column as deleted (or, if you specify a timestamp, all
versions up to and including that timestamp).

deleteColumns is what you want in most cases, unless you have to carefully control individual
versions of a specific column in a specific row.
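
To make the distinction concrete, here is a toy in-memory model of the version semantics. It is a sketch only, not HBase client code: the class ToyColumn and its method names are made up for illustration, and each comment names the Delete call it is meant to mimic. Real HBase writes delete markers rather than removing cells immediately, which this model ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of the versions of ONE column: timestamp -> value.
// Hypothetical illustration only; this is not the HBase API.
final class ToyColumn {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    // Mimics deleteColumn(family, qual): only the newest version goes away.
    void deleteLatestVersion() {
        if (!versions.isEmpty()) {
            versions.remove(versions.lastKey());
        }
    }

    // Mimics deleteColumn(family, qual, ts): exactly the version at ts goes away.
    void deleteVersion(long ts) {
        versions.remove(ts);
    }

    // Mimics deleteColumns(family, qual): every version goes away.
    void deleteAllVersions() {
        versions.clear();
    }

    // Mimics deleteColumns(family, qual, ts): versions with timestamp <= ts go away.
    void deleteVersionsUpTo(long ts) {
        versions.headMap(ts, true).clear();
    }

    // Remaining timestamps in ascending order.
    List<Long> timestamps() {
        return new ArrayList<>(versions.keySet());
    }
}
```

With versions at timestamps 1, 2 and 3, deleteLatestVersion() leaves 1 and 2 behind, whereas deleteAllVersions() leaves nothing.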

-- Lars



________________________________
 From: Ted Tuttle <ted.tuttle@mentacapital.com>
To: user@hbase.apache.org 
Cc: Development <Development@mentacapital.com> 
Sent: Thursday, June 21, 2012 7:02 AM
Subject: RE: RS unresponsive after series of deletes
 
Good hint, Ted

By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn
w/o timestamp, the time to delete row keys is reduced by 95%.

I am going to experiment w/ limited batches of Deletes, too.

Thanks everyone for help on this one.


-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Wednesday, June 20, 2012 10:13 PM
To: user@hbase.apache.org
Subject: Re: RS unresponsive after series of deletes

As I mentioned earlier, prepareDeleteTimestamps() performs one get
operation per column qualifier:
          get.addColumn(family, qual);

          List<KeyValue> result = get(get, false);

This is too costly in your case.
I think you can group some configurable number of qualifiers in each get
and perform classification on the result.
This way we can reduce the number of times
HRegion$RegionScannerImpl.next() is called.
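
The grouping idea can be sketched independently of HBase internals. The helper below is hypothetical (QualifierBatcher and group are names invented here, not part of HRegion); it only shows how a list of qualifiers might be split into groups of a configurable size, so that one get could be issued per group instead of one per qualifier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not HBase code: splits qualifiers into fixed-size groups.
final class QualifierBatcher {
    // Returns consecutive groups of at most groupSize qualifiers, preserving order.
    static <T> List<List<T>> group(List<T> qualifiers, int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int start = 0; start < qualifiers.size(); start += groupSize) {
            int end = Math.min(start + groupSize, qualifiers.size());
            // Copy the slice so each group is independent of the input list.
            groups.add(new ArrayList<>(qualifiers.subList(start, end)));
        }
        return groups;
    }
}
```

For 5 qualifiers and a group size of 2 this yields 3 groups, i.e. 3 lookups instead of 5.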

Cheers

On Wed, Jun 20, 2012 at 9:54 PM, Ted Tuttle
<ted.tuttle@mentacapital.com> wrote:

> > Do your 100s of thousands cell deletes overlap (in terms of column family)
> > across rows?
>
> Our schema contains only one column family per table. So, each Delete
> contains cells from a single column family.  I hope this answers your
> question.