hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: HBase random access in HDFS and block indices
Date Fri, 29 Oct 2010 17:01:24 GMT
On Fri, Oct 29, 2010 at 6:41 AM, Sean Bigdatafun
<sean.bigdatafun@gmail.com> wrote:
> I have the same doubt here. Let's say I have a totally random read pattern
> (uniformly distributed).
>
> Now let's assume my total data size stored in HBase is 100TB on 10
> machines (not a big deal considering today's disks), and the total size of
> my RS' memory is 10 * 6G = 60 GB. That translates into a 60/(100*1000) = 0.06%
> cache hit probability. Under a random read pattern, each read is bound to
> experience the "open-> read index -> .... -> read datablock" sequence, which
> would be expensive.
>
> Any comment?
>
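
For reference, the arithmetic in the quoted message can be checked with a short sketch. It assumes decimal units (1 TB = 1000 GB, as the quoted figure does) and treats all 60 GB of region server memory as cache. The block cache is really only a fraction of the heap, so the result is an upper bound. The function name is made up for illustration.

```python
GB = 10**9
TB = 10**12

def cache_hit_probability(cache_bytes, data_bytes):
    """Chance that a uniformly random read lands on a cached byte.

    Assumes the cache is full and every byte is equally likely to be read.
    """
    return min(1.0, cache_bytes / data_bytes)

# Figures from the quoted message: 10 machines with 6 GB each, 100 TB of data.
total_cache = 10 * 6 * GB
total_data = 100 * TB

p = cache_hit_probability(total_cache, total_data)
print(f"cache hit probability: {p:.4%}")
```

So roughly 6 reads in 10,000 would be served from memory under these assumptions.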

If totally random, as per Alvin's suggestion, yes, just turn off block
caching since it is doing you no good.
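
A toy simulation, not HBase code, illustrates why the cache does no good in this case. It assumes a plain LRU cache and uniformly random block reads, scaled down to the same 0.06% cache-to-data ratio as in the quoted message. All names and sizes here are hypothetical.

```python
import random
from collections import OrderedDict

def simulate_lru_hit_rate(num_blocks, cache_blocks, num_reads, seed=0):
    """Return the fraction of uniformly random block reads served by an LRU cache."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    cache = OrderedDict()      # keys are block ids, ordered oldest to newest
    hits = 0
    for _ in range(num_reads):
        block = rng.randrange(num_blocks)
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / num_reads

# 60 cached blocks out of 100,000 is the same 0.06% ratio as 60 GB out of 100 TB.
hit_rate = simulate_lru_hit_rate(num_blocks=100_000, cache_blocks=60, num_reads=200_000)
print(f"simulated hit rate: {hit_rate:.4%}")
```

Under uniform access the hit rate stays near the cache-to-data ratio, so nearly every read pays for inserting and evicting a block without ever reusing it.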

But totally random is unusual in practice, no?

St.Ack
