hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject Re: hbase delete operation is very slow
Date Tue, 21 Feb 2012 22:45:37 GMT

Hi there-

You probably want to see this...

http://hbase.apache.org/book.html#perf.deleting

The table.delete(delete) call in your mapper doesn't use the write buffer,
so it submits deletes one by one to the RegionServers.
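
A minimal sketch of the batching idea behind that section, assuming the deletes
are collected and handed over in groups rather than one RPC per row. BatchBuffer,
its batch size, and the sink callback are hypothetical names for illustration and
are not part of HBase. In the mapper, the sink would wrap a call to
table.delete(List<Delete>) and handle its IOException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not an HBase class): hands items to a sink in batches. */
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Queue one item; send the batch once it reaches batchSize. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Send whatever is queued; does nothing when the queue is empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Pass a copy so the sink is free to modify the list it receives.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

With something like this, map() would call add(new Delete(rowKey)) and the
mapper's cleanup() would call flush() so the last partial batch is not lost.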




On 2/21/12 3:52 PM, "Haijia Zhou" <leonster@gmail.com> wrote:

>Hi, All
>I'm new to this email list and hope I can get help from here.
>My task is to come up with an M/R job in HBase that scans the whole table,
>finds certain rows and deletes them (the whole row). This job will be
>executed on a daily basis.
>Basically I have a mapper class whose map() looks as follows:
>public void map(ImmutableBytesWritable row, Result columns,
>                Context context)
>{
>  ... do some check
>  byte[] rowKey = ...
>  if(needs to delete user){
>       Delete delete = new Delete(rowKey);
>       table.delete(delete);
>   }
>}
>
>There's no reducer needed for this task.
>
>Now, we are observing that this job takes a long time to finish (around
>3-4 hours) for 49,565,000 delete operations and 191,838,114 total records
>across 7 region servers.
>We know that a full table scan on the corresponding column/column family
>takes around 40 minutes, so the rest of the time is spent on the delete
>operations.
>
>I wonder if there's any way or tool to profile a Hadoop M/R job?
>
>Thanks
>
>Haijia


