hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Many smaller tables vs one large table
Date Wed, 09 Mar 2011 23:13:40 GMT
I guess it could be a good idea... do you need to be able to scan for
data that's contained in more than one day?


On Wed, Mar 9, 2011 at 2:08 PM, Peter Haidinyak <phaidinyak@local.com> wrote:
> Hi all,
>    Right now I am aggregating our log data and populating tables based on how we want
> to query the data later. Currently I have eleven different aggregation tables, and the date
> is part of the row key. Since we usually slice our data by day, I was wondering if it would
> be better to create a separate aggregation table per date. I would no longer have to use the
> date as part of the start/stop row keys in a scan, and it would be easier to prune old data.
> I would also guess there would be less contention on tables between the process that
> populates the table and the processes that query the table. One of the only problems I see,
> with my limited knowledge of HBase, is that the tables would end up being rather small and
> each would most likely end up on one region server.
>        Long story short, is this a good idea?
> Thanks
> -Pete
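
For reference, the single-table approach described in the quoted message (date as part of the row key, used in the scan's start and stop rows) can be sketched as follows. This is only an illustration under stated assumptions: the `YYYYMMDD` key prefix, the `|` separator, and the helper names `day_scan_bounds` and `in_range` are hypothetical and are not taken from the poster's schema. It relies on the HBase scan convention that the start row is inclusive and the stop row is exclusive, and it uses plain byte strings rather than an HBase client.

```python
from datetime import date, timedelta


def day_scan_bounds(first_day, last_day=None):
    """Return (start_row, stop_row) covering every row key whose
    date prefix falls within [first_day, last_day].

    Assumes keys begin with the date formatted as YYYYMMDD
    (a hypothetical layout chosen for this sketch).
    """
    if last_day is None:
        last_day = first_day
    start_row = first_day.strftime("%Y%m%d").encode("ascii")
    # The stop row is exclusive, so use the day after the last wanted day.
    stop_row = (last_day + timedelta(days=1)).strftime("%Y%m%d").encode("ascii")
    return start_row, stop_row


def in_range(row_key, start_row, stop_row):
    """Mimic the scan's range check: start inclusive, stop exclusive,
    compared byte by byte in lexicographic order."""
    return start_row <= row_key < stop_row


# One day, and a range that spans several days in the same table.
one_day = day_scan_bounds(date(2011, 3, 9))
several_days = day_scan_bounds(date(2011, 2, 27), date(2011, 3, 1))
```

With a date-prefixed key, a multi-day query stays a single range scan on one table. With one table per day, the same query would need one scan per table, with the client merging the results.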
