hbase-user mailing list archives

From "Ted Tuttle" <ted.tut...@mentacapital.com>
Subject RE: RS unresponsive after series of deletes
Date Thu, 21 Jun 2012 14:02:31 GMT
Good hint, Ted

By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn
w/o timestamp, the time to delete row keys is reduced by 95%.

I am going to experiment w/ limited batches of Deletes, too.
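
A minimal sketch of what limiting batch size could look like, in plain Java with no HBase dependency. The class name DeleteBatcher, the method sendInBatches, and the sink callback are all hypothetical; the sink stands in for whatever call actually submits a list of Deletes to the table, and the batch size is something to tune by experiment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of HBase: caps how many items go out per call.
final class DeleteBatcher {

    /**
     * Splits items into consecutive batches of at most batchSize and hands
     * each batch to the sink. Returns the number of batches sent.
     * The sink is an assumed stand-in for a bulk-delete call.
     */
    static <T> int sendInBatches(List<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so the sink can modify its batch without
            // affecting the source list.
            sink.accept(new ArrayList<>(items.subList(start, end)));
            batches++;
        }
        return batches;
    }
}
```

With 10 items and a batch size of 4, the sink would be called three times, with 4, 4 and 2 items.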

Thanks everyone for help on this one.


-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Wednesday, June 20, 2012 10:13 PM
To: user@hbase.apache.org
Subject: Re: RS unresponsive after series of deletes

As I mentioned earlier, prepareDeleteTimestamps() performs one get
operation per column qualifier:
          get.addColumn(family, qual);

          List<KeyValue> result = get(get, false);
This is too costly in your case.
I think you can group some configurable number of qualifiers in each get
and perform classification on result.
This way we can reduce the number of times
HRegion$RegionScannerImpl.next()
is called.
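
The grouping idea above can be sketched in plain Java. This is only a model of the approach, not HBase code: GroupedLookup, latestTimestamps, and the fetch callback are hypothetical names, and fetch stands in for a single get that covers several qualifiers and returns the latest timestamp found for each.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model: one lookup per group of qualifiers instead of one per qualifier.
final class GroupedLookup {

    /**
     * Resolves the latest timestamp per qualifier using
     * ceil(qualifiers.size() / groupSize) calls to fetch.
     * fetch is an assumed stand-in for a multi-column get; it returns
     * (qualifier -> latest timestamp) for the qualifiers that exist.
     */
    static Map<String, Long> latestTimestamps(
            List<String> qualifiers,
            int groupSize,
            Function<List<String>, Map<String, Long>> fetch) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        Map<String, Long> latest = new HashMap<>();
        for (int start = 0; start < qualifiers.size(); start += groupSize) {
            int end = Math.min(start + groupSize, qualifiers.size());
            List<String> group = qualifiers.subList(start, end);
            Map<String, Long> found = fetch.apply(group);
            // Classification step: attribute each returned entry to the
            // qualifier that was requested; absent qualifiers are skipped.
            for (String qualifier : group) {
                Long ts = found.get(qualifier);
                if (ts != null) {
                    latest.put(qualifier, ts);
                }
            }
        }
        return latest;
    }
}
```

With five qualifiers and a group size of 2, fetch would be called three times rather than five; the right group size would have to be found by measurement.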

Cheers

On Wed, Jun 20, 2012 at 9:54 PM, Ted Tuttle
<ted.tuttle@mentacapital.com> wrote:

> > Do your 100s of thousands cell deletes overlap (in terms of column
> > family) across rows ?
>
> Our schema contains only one column family per table. So, each Delete
> contains cells from a single column family.  I hope this answers your
> question.
