hbase-user mailing list archives

From Eric Charles <eric.char...@u-mangate.com>
Subject Re: versions stored in a cell
Date Tue, 05 Apr 2011 07:43:28 GMT
Hi Ted,

Thanks for pointing out that HBASE-3488 is about CellCounter.
This will give better visibility into the stored cells.

Vishal's initial question was about having numerous versions for the same 
row ID/key.

I know data model design depends on the use case, but from a technical 
point of view (read/write performance...), is there anything against 
having numerous (thousands, millions, ...) versions of the same key?

Tks,
- Eric


On 4/04/2011 18:38, Ted Yu wrote:
> For 2, HBASE-3488  is for Cell Counter.
> In Vishal's case, 3 years of data is stored for given row key. Issuing 'get'
> command would not help much.
>
> TIMERANGE support has been added in HBASE-3729
>
> Cheers
>
> On Sun, Apr 3, 2011 at 11:40 PM, Eric Charles <eric.charles@u-mangate.com> wrote:
>
>> 1.- On my side, I could imagine using versions to store the history of
>> a key (without needing an extra index table). It really depends on the
>> requirements and data model, I think, but many versions can sometimes make
>> sense.
>>
>> 2.- HBASE-3488 is related to the Hadoop RowCounter job. To get versions in
>> code, you can use the setTimeStamp/setMaxVersions/setTimeRange methods of the
>> Get and Scan objects. Via the shell, you can use "get 't1', 'r1', {COLUMN
>> => 'c1', TIMESTAMP => ts1, VERSIONS => 4}" (not sure if it's possible with
>> TIMERANGE via the shell?)
>>
>> Tks,
>> - Eric
>>
>>
>>
>> On 3/04/2011 22:12, Ted Yu wrote:
>>
>>> For 1, please give some background to justify the high number of versions.
>>>
>>> For 2, take a look at HBASE-3488
>>>
>>> On Sun, Apr 3, 2011 at 12:49 PM, Vishal Kapoor
>>> <vishal.kapoor.in@gmail.com> wrote:
>>>
>>>> two questions,
>>>>
>>>> 1) if I set the number of versions for a family to 365*3, is it a bad
>>>> design? how many versions are good practice? if I have too many
>>>> versions, will it still be a single seek when I get the row ID? if so,
>>>> will it take longer to store data? pros and cons?
>>>>
>>>> 2) how do I get the number of versions actually stored in a cell (not
>>>> the max versions it is configured to store)?
>>>>
>>>> thanks,
>>>> Vishal
>>>>
>>>>
>>>
>
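
The version parameters discussed in this thread (a configured maximum number of versions, the number actually stored, and a time range on reads) can be illustrated with a small sketch. This is a toy in-memory model of a single cell written with the Java standard library only; it is not the HBase client API, and the class and method names (VersionedCell, put, get, storedVersions) are hypothetical. It assumes versions are returned newest first and that a time range is half-open, [minStamp, maxStamp). Trimming old versions at write time is a simplification made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of one versioned cell. Hypothetical class, NOT the HBase API. */
final class VersionedCell {
    // Keys are timestamps, ordered newest first.
    private final NavigableMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());
    private final int maxVersions;

    VersionedCell(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    /** Stores a value at a timestamp and drops the oldest versions beyond the limit. */
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry in reverse order = oldest timestamp
        }
    }

    /** Number of versions actually stored, as opposed to the configured maximum. */
    int storedVersions() {
        return versions.size();
    }

    /** Returns up to 'limit' values with minStamp <= timestamp < maxStamp, newest first. */
    List<String> get(long minStamp, long maxStamp, int limit) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : versions.entrySet()) {
            if (out.size() >= limit) {
                break;
            }
            long ts = e.getKey();
            if (ts >= minStamp && ts < maxStamp) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell(3);
        for (long ts = 10; ts <= 50; ts += 10) {
            cell.put(ts, "v" + ts);
        }
        System.out.println(cell.storedVersions());            // 3
        System.out.println(cell.get(0L, Long.MAX_VALUE, 2));  // [v50, v40]
        System.out.println(cell.get(30L, 50L, 10));           // [v40, v30]
    }
}
```

In this model every read walks the stored versions from newest to oldest, so the cost of a read that asks for old data grows with the number of versions kept, which is one way to think about the performance question raised above.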
