hbase-user mailing list archives

From Daniel Połaczański <dpolaczan...@gmail.com>
Subject table schema - row with many column vs many rows
Date Thu, 26 Jan 2017 21:57:15 GMT
Hi,
At work we tested the following scenarios to compare scan
performance. We stored 2500 domain rows, each containing 20 attributes, and
then read one random row with all of its attributes a couple of times.

Scenario A
Every attribute is stored in a dedicated column: one HBase row with 20
columns.

Scenario B
Every attribute is stored as a separate row under a key like
RowKey:AttributeKey
so we have 20 HBase rows for one domain row.
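
To make the two layouts concrete, here is a minimal Python sketch that only models the keys; it does not talk to HBase. The names "domain42", "attr00" to "attr19", the column family "d" and the qualifier "v" are hypothetical and were not part of our test.

```python
# Hypothetical attribute names; the real test used 20 attributes per domain row.
ATTRIBUTES = [f"attr{i:02d}" for i in range(20)]


def scenario_a_cells(domain_id, values):
    """Scenario A: one HBase row per domain row, one column per attribute."""
    return [(domain_id, f"d:{attr}", values[attr]) for attr in ATTRIBUTES]


def scenario_b_cells(domain_id, values):
    """Scenario B: one HBase row per attribute, keyed RowKey:AttributeKey."""
    return [(f"{domain_id}:{attr}", "d:v", values[attr]) for attr in ATTRIBUTES]


values = {attr: f"value-of-{attr}" for attr in ATTRIBUTES}
cells_a = scenario_a_cells("domain42", values)
cells_b = scenario_b_cells("domain42", values)

# Distinct row keys touched when reading one domain row.
rows_a = {row for row, _, _ in cells_a}
rows_b = {row for row, _, _ in cells_b}

print(len(cells_a), len(rows_a))  # 20 cells in 1 HBase row
print(len(cells_b), len(rows_b))  # 20 cells in 20 HBase rows
```

Both layouts hold the same 20 values for a domain row; what differs is how many distinct row keys a read of that domain row has to cover.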

As we know, in HBase everything is stored as entries of the following form:
RowKey:ColumnKey:Value

Theoretically HBase holds the same number of entries (2500*20) in
both scenarios, so there shouldn't be any difference in performance. But it
looks like scanning in Scenario A is much faster (something like 10
times).
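
The arithmetic behind that expectation can be sketched as follows. This only restates the counts from the test setup (2500 domain rows, 20 attributes); the function and key names are made up for illustration, and it makes no claim about where the time actually goes.

```python
DOMAIN_ROWS = 2500
ATTRIBUTES_PER_ROW = 20


def layout_totals(domain_rows, attrs):
    """Count HBase rows and entries for each scenario, per the setup above."""
    total_entries = domain_rows * attrs
    return {
        "A": {
            "hbase_rows": domain_rows,      # one HBase row per domain row
            "entries": total_entries,
            "rows_per_domain_read": 1,
        },
        "B": {
            "hbase_rows": total_entries,    # one HBase row per attribute
            "entries": total_entries,
            "rows_per_domain_read": attrs,
        },
    }


totals = layout_totals(DOMAIN_ROWS, ATTRIBUTES_PER_ROW)
for name, counts in totals.items():
    print(name, counts)
```

The entry count is 50000 in both scenarios; the number of HBase rows, and the number of rows returned when reading one domain row, differ by a factor of 20.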

Do you have any idea why Scenario A is better?

Regards
