hbase-user mailing list archives

From Seraph Imalia <ser...@eisp.co.za>
Subject Re: Improving HBase scanner
Date Wed, 05 May 2010 15:01:06 GMT
Yeah, that is exactly why we are using a GUID for the row key :)

Michelan is busy writing code to add secondary indexing. The table is
about 200 GB, so it is going to take a while to run, but it looks
like the only option we have.
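
A minimal sketch of the secondary-index idea, assuming the index is a second table whose row key is the record's timestamp followed by the main table's GUID row key. Nothing here calls HBase: `TimeIndex` is a hypothetical in-memory stand-in for a table kept sorted by row key, and the 8-byte big-endian timestamp encoding is an assumption made for illustration.

```python
import bisect
import struct


class TimeIndex:
    """Hypothetical in-memory stand-in for an index table sorted by row key."""

    def __init__(self):
        # Sorted index keys: 8-byte big-endian timestamp + main-table row key.
        self._keys = []

    def put(self, ts_millis: int, main_row_key: bytes) -> None:
        # Big-endian packing makes byte order match numeric time order.
        key = struct.pack(">Q", ts_millis) + main_row_key
        bisect.insort(self._keys, key)

    def scan(self, start_millis: int, stop_millis: int) -> list:
        # Range scan over [start, stop): only the matching slice is touched,
        # instead of every row in the table.
        lo = bisect.bisect_left(self._keys, struct.pack(">Q", start_millis))
        hi = bisect.bisect_left(self._keys, struct.pack(">Q", stop_millis))
        # Strip the timestamp prefix to recover the main-table row keys.
        return [k[8:] for k in self._keys[lo:hi]]
```

A time query then becomes two steps: scan the index for the range, and fetch each returned GUID from the main table by key. The cost is one extra write per record plus a one-off backfill of the existing rows.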

On 05 May 2010, at 10:29 AM, TuX RaceR wrote:

> Also be aware that with a time-based key you will probably create
> 'hot spots', i.e. each node in turn will take all of the load at
> write time, and possibly at read time too, if you query only
> recent data.
> But I do not see any way to avoid that, as you do need a scanner,
> cheers
> TuX
> TuX RaceR wrote:
>> Seraph Imalia wrote:
>>> Hi Ryan,
>>> Thanks for your response - I am also working on this project.
>>> I was hoping that HBase perhaps treated the time range differently,
>>> which would prevent a full table scan. I suppose our only remaining
>>> option is to implement indexing?
>> Yes, I would say so, unless a time-based key can naturally
>> identify a record, or you will always retrieve your records
>> using time queries.
>> In that case you could create a key which is a concatenation of a
>> timestamp and your old SQL uid,
>> cheers
>> TuX
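
The concatenated key suggested above could look roughly like the following sketch. It assumes the timestamp is stored as milliseconds in 8 big-endian bytes and that the uid is a UUID; the function names are hypothetical and no HBase API is involved. With this layout, byte-wise key order follows time order, so a time range maps to a start key and a stop key.

```python
import struct
import uuid


def make_row_key(ts_millis: int, uid: uuid.UUID) -> bytes:
    # 8-byte big-endian timestamp followed by the 16 raw UUID bytes.
    return struct.pack(">Q", ts_millis) + uid.bytes


def split_row_key(key: bytes):
    # Inverse of make_row_key: recover the timestamp and the UUID.
    (ts_millis,) = struct.unpack(">Q", key[:8])
    return ts_millis, uuid.UUID(bytes=key[8:])


def time_range_bounds(start_millis: int, stop_millis: int):
    # Start key (inclusive) and stop key (exclusive) for a range scan.
    # A bare timestamp prefix sorts before every full key sharing that
    # timestamp, so the range covers start_millis and excludes stop_millis.
    return struct.pack(">Q", start_millis), struct.pack(">Q", stop_millis)
```

Big-endian is the design choice that matters: little-endian or decimal strings of varying length would not sort in time order when compared byte by byte.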
