hbase-user mailing list archives

From Oleg Ruchovets <oruchov...@gmail.com>
Subject Re: delete rows from hbase
Date Tue, 19 Jun 2012 16:17:36 GMT
Thank you all for the answers. I am trying to speed up my solution by using
map/reduce over HBase.

Here is the code.
I want to emit a Delete from the map function to delete each row, and I pass the same
tableName to both TableMapReduceUtil.initTableMapperJob
and TableMapReduceUtil.initTableReducerJob.

Question: is it possible to emit a Delete from the map function the way I did?




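import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DeleteRowByCriteria {
    final static Logger LOG = LoggerFactory.getLogger(DeleteRowByCriteria.class);

    public static class MyMapper extends TableMapper<ImmutableBytesWritable, Delete> {

        public String account;
        public String lifeDate;

        // Emit one Delete per scanned row, keyed by the row key.
        @Override
        public void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            context.write(row, new Delete(row.get()));
        }
    }

    public static void main(String[] args)
            throws ClassNotFoundException, IOException, InterruptedException {

        String tableName = args[0];
        String filterCriteria = args[1];

        Configuration config = HBaseConfiguration.create();
        Job job = new Job(config, "DeleteRowByCriteria");
        job.setJarByClass(DeleteRowByCriteria.class);

        try {
            // Select only the rows whose key starts with the given prefix.
            Filter campaignIdFilter = new PrefixFilter(Bytes.toBytes(filterCriteria));
            Scan scan = new Scan();
            scan.setFilter(campaignIdFilter);
            scan.setCaching(500);
            scan.setCacheBlocks(false);

            TableMapReduceUtil.initTableMapperJob(
                    tableName,
                    scan,
                    MyMapper.class,
                    null,
                    null,
                    job);

            // Same table as the source; map-only job, so no reducer class.
            TableMapReduceUtil.initTableReducerJob(
                    tableName,
                    null,
                    job);
            job.setNumReduceTasks(0);

            boolean b = job.waitForCompletion(true);
            if (!b) {
                throw new IOException("error with job!");
            }

        } catch (Exception e) {
            LOG.error(e.getMessage(), e);
        }
    }
}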
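
A side note on the scan range: the job above selects rows with a PrefixFilter only, while the code quoted below bounds the scan with a start row and a stop row obtained by incrementing the last byte of the prefix. The following is a minimal sketch of that stop-row computation in plain Java, with no HBase dependency. The class and method names (PrefixStopRow, stopRowForPrefix) are made up for illustration and are not part of the HBase API. It also handles a trailing 0xFF byte, which a bare increment would overflow; in that case the byte is dropped and the previous one is incremented. An empty result is meant to be read as "no upper bound", on the assumption that an empty stop row means scanning to the end of the table.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: computes an exclusive stop row
// for scanning all row keys that start with a given prefix.
final class PrefixStopRow {

    /**
     * Returns the smallest key that sorts after every key starting with
     * prefix, or an empty array if no such key exists (the prefix is empty
     * or consists only of 0xFF bytes).
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        // Skip trailing 0xFF bytes: they cannot be incremented.
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return new byte[0]; // no upper bound
        }
        // Keep the bytes up to position i and increment the last one kept.
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("12345".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints 12346
    }
}
```

With the prefix bytes as the start row and this value as the stop row, the scan would only touch the key range that can contain matching rows, rather than filtering the whole table.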



On Tue, Jun 19, 2012 at 9:26 AM, Kevin O'dell <kevin.odell@cloudera.com> wrote:

> Oleg,
>
>  Here is some code that we used for deleting all rows with user name
> foo.  It should be fairly portable to your situation:
>
> import java.io.IOException;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.Delete;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.ResultScanner;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class HBaseDelete {
>     public static void main(String[] args) throws IOException {
>         Configuration conf = HBaseConfiguration.create();
>         HTable t = new HTable(conf, "t");
>
>         String user = "foo";
>
>         // Scan from "foo" (inclusive) to "fop" (exclusive).
>         byte[] startRow = Bytes.toBytes(user);
>         byte[] stopRow = Bytes.toBytes(user);
>         stopRow[stopRow.length - 1]++; // 'fop'
>         Scan scan = new Scan(startRow, stopRow);
>         ResultScanner sc = t.getScanner(scan);
>         for (Result r : sc) {
>             t.delete(new Delete(r.getRow()));
>         }
>         sc.close();
>         t.close();
>     }
> }
> /**
>  * Start row: foo
>  * HBase begins matching at this row key, one byte after another.
>  * End row: foo
>  * HBase stops matching at the first match, because start == stop.
>  * End row: fo[p] (p being o + 1)
>  * HBase stops matching at the first row key that does not start with "foo".
>  */
>
>
> On Tue, Jun 19, 2012 at 6:46 AM, Mohammad Tariq <dontariq@gmail.com>
> wrote:
> > You can use the HBase RowFilter to do that.
> >
> > Regards,
> >     Mohammad Tariq
> >
> >
> > On Tue, Jun 19, 2012 at 1:13 PM, shashwat shriparv
> > <dwivedishashwat@gmail.com> wrote:
> >> Try to implement something like this:
> >>
> >> Class RegexStringComparator
> >>
> >>
> >>
> >> On Tue, Jun 19, 2012 at 5:06 AM, Amitanand Aiyer <amitanand.s@fb.com> wrote:
> >>
> >>> You could set up a scan with the criteria you want (start row, end row,
> >>> KeyOnlyFilter, etc.), and issue a delete for
> >>> the rows you get.
> >>>
> >>> On 6/18/12 3:08 PM, "Oleg Ruchovets" <oruchovets@gmail.com> wrote:
> >>>
> >>> >Hi,
> >>> >I need to delete rows from an HBase table by criteria.
> >>> >For example, I need to delete all rows starting with "12345".
> >>> >I didn't find a way to set a row prefix for a delete operation.
> >>> >What is the best practice for deleting rows by criteria from an HBase
> >>> >table?
> >>> >
> >>> >Thanks in advance.
> >>> >Oleg.
> >>>
> >>>
> >>
> >>
> >> --
> >>
> >>
> >> ∞
> >> Shashwat Shriparv
>
>
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
>
