hbase-user mailing list archives

From Lin Ma <lin...@gmail.com>
Subject asking for advice to improve performance
Date Wed, 22 Aug 2012 16:57:51 GMT
Hi HBase masters,

I am reading from HBase in a reducer. In the reducer, the key is a student ID
and the value is a book ID (each key/value pair records one book read by the
student). HBase uses the book ID as the row key. In the reducer, I query HBase
by book ID to fetch information such as the author, the price, and the abstract
of the book. The HBase table holds a large volume of data, on the order of
millions of records (books). My concern is that reading from HBase will slow
down the reducer, since fetching the data involves remote I/O (the reducer may
read data that belongs to a remote region server). I am also not confident
about the HBase cache hit rate, since the access pattern for books is random
(a student may read any book).
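
To make the access pattern concrete, here is a minimal sketch of what each reduce call does. It is not the actual job: `BookStore` and `CountingStore` are hypothetical stand-ins for the HBase table so the example runs without a cluster, and the author and price fields are only illustrative. The counter shows that every book ID value costs one lookup, which against real HBase would be one remote read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BookLookupSketch {

    /** Hypothetical stand-in for the HBase table keyed by book ID. */
    public interface BookStore {
        // One call here models one remote read against a region server.
        Map<String, String> fetch(String bookId);
    }

    /** In-memory store that counts reads so the per-value cost is visible. */
    public static class CountingStore implements BookStore {
        private final Map<String, Map<String, String>> rows = new HashMap<>();
        public int reads = 0;

        public void put(String bookId, String author, String price) {
            Map<String, String> row = new HashMap<>();
            row.put("author", author);
            row.put("price", price);
            rows.put(bookId, row);
        }

        @Override
        public Map<String, String> fetch(String bookId) {
            reads++;
            return rows.get(bookId);
        }
    }

    /** Models one reduce call: key = student ID, values = book IDs. */
    public static List<String> reduce(String studentId,
                                      Iterable<String> bookIds,
                                      BookStore store) {
        List<String> out = new ArrayList<>();
        for (String bookId : bookIds) {
            // One lookup per value, even if the same book appears twice.
            Map<String, String> row = store.fetch(bookId);
            if (row != null) {
                out.add(studentId + "\t" + bookId + "\t"
                        + row.get("author") + "\t" + row.get("price"));
            }
        }
        return out;
    }
}
```

With millions of books and a random access pattern, most of these lookups would be expected to touch a different row, which is the source of my concern about both remote I/O and the cache hit rate.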

Any advice on improving performance is appreciated, including changes to the
HBase schema. Thanks.

