hbase-user mailing list archives

From Mark <static.void....@gmail.com>
Subject Re: Multiple tables vs big fat table
Date Mon, 21 Nov 2011 01:18:21 GMT
Thanks for the info.

On 11/20/11 11:30 AM, lars hofhansl wrote:
> There are many considerations here, but one is that separate tables provide completely separate namespaces.
> If you use one table, the design of the key space is more involved, as you need to separate the namespaces with key prefixes.
>
>
> So if you never have to access data from separate key spaces in a single scan, then go for multiple tables.
>
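A minimal sketch of the single-table approach described above, using the ActionLog example from the original question. It models only the byte-wise sort order of row keys in plain Python and makes no HBase client calls. The separator byte, the key layout (action type, user id, timestamp), and the function names are all assumptions made for illustration; none of them come from the thread.

```python
from bisect import bisect_left

# Hypothetical separator; assumes this byte never occurs in an action type.
SEP = b"\x00"

def row_key(action_type, user_id, ts_millis):
    """Build a row key: <action type> SEP <user id> SEP <8-byte big-endian timestamp>."""
    return SEP.join([
        action_type.encode(),
        user_id.encode(),
        ts_millis.to_bytes(8, "big"),
    ])

def prefix_range(action_type):
    """Return [start, stop) bounds that cover every key of one action type."""
    prefix = action_type.encode()
    start = prefix + SEP
    stop = prefix + bytes([SEP[0] + 1])  # first key after the prefix's key space
    return start, stop

def scan(sorted_keys, start, stop):
    """Emulate a range scan over keys kept in byte-wise sorted order."""
    lo = bisect_left(sorted_keys, start)
    hi = bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]

# One "table" holding three kinds of log entries.
keys = sorted([
    row_key("search", "u1", 2),
    row_key("pageview", "u1", 1),
    row_key("login", "u2", 3),
    row_key("search", "u2", 1),
])
start, stop = prefix_range("search")
search_keys = scan(keys, start, stop)
```

With this layout, a scan bounded by one prefix touches a single action type, while an unbounded scan crosses all of them. That cross-type scan is the case where one table has an edge over separate tables.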
> On the other hand, one big table will probably distribute better over the regionservers and lead to fewer regions overall.
>
> So it depends on how many tables you envision. 10 or 20 or even 100 or so is probably OK. 1000 tables or more will lead to a very large number of regions and hence overhead at the regionservers.
>
>
>
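The region-count argument above can be made concrete with a rough back-of-the-envelope model. This is a sketch under simplifying assumptions: every table occupies at least one region, a table splits evenly once it exceeds a maximum region size, and pre-splitting and split policies are ignored. The sizes and the 10 GB threshold are hypothetical numbers chosen for the example, not values from the thread.

```python
import math

def region_count(table_sizes_gb, max_region_gb):
    """Estimate total regions: each table needs at least one region,
    plus more once it grows past the maximum region size."""
    return sum(
        max(1, math.ceil(size / max_region_gb))
        for size in table_sizes_gb
    )

# Same 500 GB of data, laid out two ways (hypothetical sizes).
many_small_tables = region_count([0.5] * 1000, max_region_gb=10)
one_big_table = region_count([500], max_region_gb=10)
```

Under these assumptions the many-table layout needs 1000 regions and the single table needs 50, because each small table still costs a whole region no matter how little data it holds.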
> ________________________________
>   From: Mark<static.void.dev@gmail.com>
> To: user@hbase.apache.org
> Sent: Sunday, November 20, 2011 9:54 AM
> Subject: Re: Multiple tables vs big fat table
>
> I'm more interested in how and why it would depend rather than the
> actual answer.
>
> In evenly distributed systems you should do x/y because ..... If your
> data is not evenly distributed then you should...
>
> Thanks
>
>
> On 11/20/11 12:57 AM, Michel Segel wrote:
>> Mark,
>> Simple answer ... it depends... ;-)
>>
>> Longer answer...
>> What's your use case? What's your access pattern? Is the type of data, in this case, evenly distributed in terms of size?
>>
>>
>>
>> Sent from a remote device. Please excuse any typos...
>>
>> Mike Segel
>>
>> On Nov 18, 2011, at 3:29 PM, Mark<static.void.dev@gmail.com>   wrote:
>>
>>> Is it better to have many smaller tables or one larger table? For example, if we wanted to store user action logs we could do either of the following:
>>>
>>> Multiple tables:
>>> - SearchLog
>>> - PageViewLog
>>> - LoginLog
>>>
>>> or
>>>
>>> One table:
>>>     - ActionLog, where the key could be a concatenation that includes the action type, e.g. (search, pageview, login)
>>>
>>> Any ideas? Are there any performance considerations with having multiple smaller tables?
>>>
>>> Thanks
>>>
>>>
