hbase-user mailing list archives

From edward choi <mp2...@gmail.com>
Subject Adjusting column value size.
Date Tue, 04 Oct 2011 05:58:30 GMT

I have a question about performance and column value size.
I need to store several million integers per row. ("Several million" is
the important part here.)
I was wondering which of the following methods would perform better.

1) Store each integer in its own column, so that reading a row fetches
several million columns. The user would then map the column values into
some kind of container (e.g. vector, ArrayList).
2) Store, for example, a thousand integers in a single column (by
concatenating them), so that reading a row fetches only several thousand
columns. The user would have to split each column value into 4-byte chunks
and map the decoded integers into some kind of container (e.g. vector,
ArrayList).
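
To make option 2 concrete, here is a minimal sketch of the packing and
splitting I have in mind. It uses only java.nio and does not touch HBase
itself; the class and method names (IntPacking, pack, unpack) are made up
for illustration, and I am assuming 4-byte big-endian integers, which is
ByteBuffer's default byte order.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of HBase: packs many ints into one
// byte[] that could be stored as a single column value, and back.
class IntPacking {

    // Concatenate the ints into one byte array, 4 bytes each (big-endian).
    static byte[] pack(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * 4);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Split a packed column value back into its ints.
    static int[] unpack(byte[] columnValue) {
        if (columnValue.length % 4 != 0) {
            throw new IllegalArgumentException(
                "length must be a multiple of 4, got " + columnValue.length);
        }
        int[] out = new int[columnValue.length / 4];
        // View the bytes as ints and copy them out in one call.
        ByteBuffer.wrap(columnValue).asIntBuffer().get(out);
        return out;
    }
}
```

With a thousand integers per column, each column value would be 4,000
bytes, and the "additional process" on the client is one unpack call per
column.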

I am curious which approach would be better. 1) fetches several million
columns but needs no additional processing. 2) fetches only several
thousand columns but needs additional processing on the client.
Any advice would be appreciated.

