hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject Re: Disk Seeks and Column families
Date Sat, 21 Jan 2012 13:52:08 GMT

Also, for #2, HBase supports large-scale aggregation through MapReduce.

On 1/21/12 7:47 AM, "Andrey Stepachev" <octo47@gmail.com> wrote:

>2012/1/21 Praveen Sripati <praveensripati@gmail.com>:
>> Hi,
>> 1) According to this URL (1), HBase performs well with two or three
>> column families. Why is that?
>First, each column family is stored in a separate location, so, as stated in
>'6.2.1. Cardinality of ColumnFamilies', such a schema design can lead
>to many small pieces for a small column family, and aggregation over them
>is likely to be slow.
>Second, when a region splits, all of its column families split too;
>with a large number of them this can be inefficient.
>Third, the number of memstores matters. Each column family
>has its own memstore, so forced flushes and blocked writes
>become more likely.
>> 2) A dump of an HFile looks like the one below. The contents of a row stay
>> together, as in a regular row-oriented database. If the column family has 100
>> column qualifiers and is dense, then the data for a particular column
>> qualifier is spread wide. If I want to do an aggregation on a particular
>> column qualifier, the disk seeks don't seem to be much better than in a
>> regular row-oriented database.
>You don't need a seek for each column; HBase reads blocks and filters
>out unneeded data.
>Most of the performance gain comes from collocated keys and compression.
>BTW, HBase is not so good with wide tables; HBase prefers tall
>> Please correct me if I am wrong.
>> K: row-550/colfam1:50/1309813948188/Put/vlen=2 V: 50
>> K: row-550/colfam1:50/1309812287166/Put/vlen=2 V: 50
>> K: row-551/colfam1:51/1309813948222/Put/vlen=2 V: 51
>> K: row-551/colfam1:51/1309812287200/Put/vlen=2 V: 51
>> K: row-552/colfam1:52/1309813948256/Put/vlen=2 V: 52
>> (1) - http://hbase.apache.org/book/number.of.cfs.html
>> Thanks,
>> Praveen
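
To make the "read blocks and filter out unneeded data" point concrete, here is a toy sketch in plain Python. It is not HBase code and does not use any HBase API: it only parses the dump lines quoted above, walks them once in order, and sums the newest value per row for one column qualifier. The names KeyValue, parse_line and sum_qualifier are made up for illustration, and the parser assumes row keys contain no "/" character.

```python
from collections import namedtuple

# Hypothetical record type mirroring the fields visible in the dump lines.
KeyValue = namedtuple("KeyValue", "row family qualifier timestamp type value")

# The five dump lines quoted in the message above.
DUMP = """\
K: row-550/colfam1:50/1309813948188/Put/vlen=2 V: 50
K: row-550/colfam1:50/1309812287166/Put/vlen=2 V: 50
K: row-551/colfam1:51/1309813948222/Put/vlen=2 V: 51
K: row-551/colfam1:51/1309812287200/Put/vlen=2 V: 51
K: row-552/colfam1:52/1309813948256/Put/vlen=2 V: 52
"""


def parse_line(line):
    """Split one 'K: row/family:qualifier/ts/type/vlen=N V: value' line."""
    key_part, value = line.split(" V: ")
    key = key_part[len("K: "):]
    # Assumption: the row key itself contains no '/'.
    row, column, timestamp, kv_type, _vlen = key.split("/")
    family, qualifier = column.split(":")
    return KeyValue(row, family, qualifier, int(timestamp), kv_type, value.strip())


def sum_qualifier(lines, family, qualifier):
    """One sequential pass: skip unwanted cells, keep the newest version per row."""
    newest = {}
    for line in lines:
        kv = parse_line(line)
        if kv.family != family or kv.qualifier != qualifier:
            continue  # filtered out while scanning, no extra seek involved
        if kv.row not in newest or kv.timestamp > newest[kv.row].timestamp:
            newest[kv.row] = kv
    return sum(int(kv.value) for kv in newest.values())


lines = DUMP.splitlines()
total_51 = sum_qualifier(lines, "colfam1", "51")
```

The pass is purely sequential: cells for other qualifiers are read and discarded rather than skipped with a seek, which is the behaviour Andrey describes. A real aggregation over a large table would run as a MapReduce job, as Doug notes.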
