hbase-issues mailing list archives

From "YiFeng Jiang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-3477) Filter for deprecated mapred APIs doesn't work when the table has few rows
Date Tue, 25 Jan 2011 10:38:43 GMT
Filter for deprecated mapred APIs doesn't work when the table has few rows
--------------------------------------------------------------------------

                 Key: HBASE-3477
                 URL: https://issues.apache.org/jira/browse/HBASE-3477
             Project: HBase
          Issue Type: Bug
          Components: filters
    Affects Versions: 0.90.0
         Environment: Linux (Debian), master 1, slaves 2
            Reporter: YiFeng Jiang


It seems that filters are not invoked when the table contains only a few rows.

I added some log statements to org.apache.hadoop.hbase.filter.PrefixFilter, and I have a
MyInputFormat that extends hbase.mapred.TableInputFormat (the deprecated mapred API).

The log statements added to PrefixFilter:
{noformat} 
  public boolean filterRowKey(byte[] buffer, int offset, int length) {
    log.info("TODO: filterRowKey invoked");
    if (buffer == null || this.prefix == null) {
      log.info("TODO: #1 of filter");
      return true;
    }
    if (length < prefix.length) {
    ...
  }
{noformat} 

This is the code in my InputFormat's configure method.
{noformat} 
byte[] prefix = Bytes.toBytes("001");
Filter filter = new PrefixFilter(prefix);
setRowFilter(filter);
{noformat} 
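
To make the expected behavior concrete: with the prefix "001", only rows whose key starts with "001" should reach the mapper, regardless of table size. Below is a minimal stand-alone sketch of that prefix check. It is not the HBase PrefixFilter source. The class name PrefixSketch, the helper keep(), and the sample row keys are hypothetical, and returning true for keys shorter than the prefix is an assumption, since the snippet above elides that branch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; NOT the HBase PrefixFilter implementation.
class PrefixSketch {

    // Follows the convention visible in the logged method: true means "filter this row out".
    static boolean filterRowKey(byte[] buffer, int offset, int length, byte[] prefix) {
        if (buffer == null || prefix == null) {
            return true;
        }
        // Assumption: a key shorter than the prefix cannot match, so it is filtered out.
        if (length < prefix.length) {
            return true;
        }
        // Compare the first prefix.length bytes of the key with the prefix.
        for (int i = 0; i < prefix.length; i++) {
            if (buffer[offset + i] != prefix[i]) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: returns the row keys that survive the prefix check.
    static List<String> keep(List<String> rowKeys, String prefix) {
        byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            byte[] k = key.getBytes(StandardCharsets.UTF_8);
            if (!filterRowKey(k, 0, k.length, p)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample keys, for illustration only.
        System.out.println(keep(Arrays.asList("001-a", "002-b", "0010", "01"), "001"));
    }
}
```

If the filter were applied on the small table, the job should see only the keys that pass a check like this one. Seeing every row instead suggests the filter is never attached to the scan.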

And the job setup code.
{noformat} 
job.setInputFormat(MyInputFormat.class);
FileInputFormat.addInputPaths(job, "my_table_in_hbase");
job.set(TableInputFormat.COLUMN_LIST, "data:");
{noformat} 

When I put a lot of rows (> 500,000) in the table, the filter works well, but when I put
only a few rows (< 100) in the table, the filter does not appear to be invoked, and
the log statements in the filter produce no output either.

This is the log output when there is a lot of data in the table:
{noformat} 
2011-01-25 16:43:59,568 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: default constructor
2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterRowKey invoked
2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: #3 of filter
2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
2011-01-25 16:44:01,729 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
2011-01-25 16:44:01,729 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
{noformat} 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

