hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Slow row deletion performance in comparison to insertion
Date Wed, 27 Jun 2012 21:15:36 GMT
bq. if I batch the deletes into one big one at the end (rather than while
I'm scanning)
That's what you should do.

See also HBASE-6284 where an optimization, HRegion#doMiniBatchDelete(), is
under development.
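
For reference, a minimal sketch of client-side batching in fixed-size chunks, with the buffer cleared after each submission. `DeleteBatcher`, `submitInChunks`, and the `sink` callback are hypothetical names for illustration and are not part of the HBase client API. In real code the sink would presumably wrap a call such as table.batch(...) on a list of Delete objects (an assumption, since the quoted code is pseudocode).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of the HBase client API.
class DeleteBatcher {
    // Buffers items and hands them to `sink` in chunks of at most `batchSize`.
    // Assumption: in real client code, `sink` would wrap something like table.batch(...).
    static <T> int submitInChunks(Iterable<T> rows, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> buffer = new ArrayList<>(batchSize);
        int calls = 0;
        for (T row : rows) {
            buffer.add(row);
            if (buffer.size() == batchSize) {
                sink.accept(new ArrayList<>(buffer)); // copy so the sink owns its chunk
                buffer.clear();                        // start the next chunk empty
                calls++;
            }
        }
        if (!buffer.isEmpty()) {                       // flush the final partial chunk
            sink.accept(new ArrayList<>(buffer));
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 50000; i++) {
            rows.add(i);
        }
        int[] submitted = {0};
        int calls = submitInChunks(rows, 1000, chunk -> submitted[0] += chunk.size());
        System.out.println(calls + " calls, " + submitted[0] + " rows"); // prints "50 calls, 50000 rows"
    }
}
```

Clearing the buffer after each submission keeps every call at the chosen batch size, so each row is sent exactly once. Passing a single chunk size equal to the total row count gives the "one big batch at the end" variant.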

On Wed, Jun 27, 2012 at 2:03 PM, Jeff Whiting <jeffw@qualtrics.com> wrote:

> I'm struggling to understand why my deletes are taking longer than my
> inserts.  My understanding is that a delete is just an insertion of a
> tombstone.  And I'm deleting the entire row.
>
> I do a simple loop (pseudo code) and insert the 100 byte rows:
>
> for (int i=0; i < 50000; i++)
> {
>    puts.append(new Put(rowkey[i], oneHundredBytes[i]));
>
>    if (puts.size() % 1000 == 0)
>    {
>        Benchmark.start();
>        table.batch(puts);
>        Benchmark.stop();
>    }
> }
>
>
> The above takes about 8282ms total.
>
> However the delete takes more than twice as long:
>
> Iterator it = table.getScannerScan(rowkey[0], rowkey[50000-1]).iterator();
> while(it.hasNext())
> {
>    r = it.next();
>    deletes.append(new Delete(r.getRow()));
>    if (deletes.size() % 1000 == 0)
>    {
>        Benchmark.start();
>        table.batch(deletes);
>        Benchmark.stop();
>    }
> }
>
> The above takes 17369ms total.
>
> I'm only benchmarking the deletion time and not the scan time.
> Additionally if I batch the deletes into one big one at the end (rather
> than while I'm scanning) it takes about the same amount of time. I am
> deleting the entire row so I wouldn't think it would be doing a read before
> the delete (http://mail-archives.apache.org/mod_mbox/hbase-user/201206.mbox/%3CE83D30E8F408F94A96F992785FC29D82063395D6@s2k3mntaexc1.mentacapital.local%3E).
>
> Any thoughts on why it is slower and how I can speed it up?
>
> Thanks,
> ~Jeff
>
> --
> Jeff Whiting
> Qualtrics Senior Software Engineer
> jeffw@qualtrics.com
>
>
