hbase-dev mailing list archives

From Vladimir Rodionov <vrodio...@carrieriq.com>
Subject RE: Random I/O performance
Date Wed, 26 Oct 2011 21:50:37 GMT


>> Are you hitting cache at all?
>
> It's totally random, due to the proposed key design, which favored fast inserts. Keys are
> randomized values; that is why there is no data locality in row lookups. The effect of the
> cache (LruBlockCache?) is negligible in this case.
>

>>So a different schema would get cache into the mix?

You can't change the schema while the system is in production.


>>Its going to keep growing without bound?


No, we keep data for XX days, then purge stale data from the table.


My question was: what else, besides the obvious (run everything in parallel), can help improve
random I/O?

1. Will a Bloom filter help to optimize the HBase read path?
2. We already use compression.
3. Block size: does it really matter much?
4. Off-heap block cache? Is it in the 0.92 trunk? Has anybody performed real performance tests
on the off-heap cache?
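On question 1: a ROW Bloom filter should help a fully random-read workload, because each Get can skip any HFile whose filter says the key is definitely absent, saving disk seeks. Below is a minimal, simplified sketch of the idea (hypothetical illustration code, not HBase's actual implementation):

```java
// Minimal Bloom filter sketch: shows why a per-HFile ROW bloom lets a
// random Get skip files that cannot contain the key.
import java.util.Arrays;
import java.util.BitSet;

public class BloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public BloomSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive the i-th bit index from two base hashes
    // (Kirsch-Mitzenmacher double hashing).
    private int index(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = (h1 >>> 16) | (h1 << 16);
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(byte[] key) {
        for (int i = 0; i < hashes; i++) bits.set(index(key, i));
    }

    // false => key definitely absent, so the HFile can be skipped entirely;
    // true  => key *may* be present (false positives possible).
    public boolean mightContain(byte[] key) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(index(key, i))) return false;
        return true;
    }

    public static void main(String[] args) {
        BloomSketch bloom = new BloomSketch(1 << 16, 4);
        bloom.add("row-0001".getBytes());
        // No false negatives: a stored key is always reported present.
        assert bloom.mightContain("row-0001".getBytes());
        // Most absent keys are rejected without ever touching the file.
        int rejected = 0;
        for (int i = 0; i < 1000; i++)
            if (!bloom.mightContain(("absent-" + i).getBytes())) rejected++;
        System.out.println("rejected " + rejected + " of 1000 absent keys");
    }
}
```

In HBase itself the filter is enabled per column family (e.g. `BLOOMFILTER => 'ROW'` when altering the table from the shell), which does not require a schema change to the keys, only a table alter plus compaction to rewrite the HFiles.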

We could easily allocate 10-15 GB per node, thus effectively caching hot data from the other
tables (not the fact table).
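If that heap is dedicated mostly to caching, the fraction of it handed to the LruBlockCache is worth revisiting. A hedged sketch of the usual knob in hbase-site.xml (the 0.4 value here is an illustration, not a recommendation; the default is around 0.2-0.25 depending on version):

```xml
<!-- Fraction of the region server heap given to the LruBlockCache. -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
</property>
```

Marking the hot (non-fact) tables' column families as `IN_MEMORY => 'true'` additionally prioritizes their blocks within that cache.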

Off-heap cache: what is the max size of off-heap cache we could try?
My major concerns are:

- Memory allocators are pretty hard to debug and get working right.
- Memory fragmentation?
- It still relies on on-heap Java data structures to perform eviction, which can degrade
performance for large caches.
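For reference, if I understand the 0.92 codebase correctly, the experimental off-heap (slab) cache is switched on via configuration rather than code: `hbase.offheapcache.percentage` in hbase-site.xml, plus giving the JVM enough direct memory. A sketch, with the 12g figure purely illustrative (please verify the property names against the docs for your exact build):

```sh
# hbase-env.sh: direct (off-heap) memory ceiling for the JVM.
# The slab cache allocates out of this region, so it must be large
# enough to cover hbase.offheapcache.percentage in hbase-site.xml.
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=12g"
```

Setting `hbase.offheapcache.percentage` to 0 disables it, which makes A/B performance testing straightforward.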
