hbase-user mailing list archives

From Dalia Sobhy <dalia.mohso...@hotmail.com>
Subject Hbase Question
Date Sun, 23 Dec 2012 23:26:35 GMT

Dear all,

I have 50,000 rows with the diagnosis qualifier = "cardiac", and another 50,000 rows with "renal".

When I type this in the HBase shell:

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes

scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
    SingleColumnValueFilter.new(Bytes.toBytes('info'),
         Bytes.toBytes('diagnosis'),
         CompareFilter::CompareOp.valueOf('EQUAL'),
         SubstringComparator.new('cardiac'))}

Output = 50,000 rows

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes

count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
    SingleColumnValueFilter.new(Bytes.toBytes('info'),
         Bytes.toBytes('diagnosis'),
         CompareFilter::CompareOp.valueOf('EQUAL'),
         SubstringComparator.new('cardiac'))}
Output = 100,000 rows

I even tried it using the HBase Java API with an AggregationClient instance, and I enabled the
aggregation coprocessor for the table:
rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)

Also, when I measure the performance after adding more nodes, the operation takes
the same time.

Any advice, please?

I have been stuck in this mess for a couple of weeks.

Thanks,


 
 		 	   		  