hbase-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Hadoop/HBase hardware requirement
Date Mon, 22 Nov 2010 00:44:17 GMT
On Sun, Nov 21, 2010 at 5:53 AM, Oleg Ruchovets <oruchovets@gmail.com> wrote:

> Hi all,
> After testing HBase for a few months with a very light configuration (5
> machines, 2 TB disk, 8 GB RAM), we are now planning for production.
> Our Load -
> 1) 50GB log files to process per day by Map/Reduce jobs.
> 2) Insert 4-5GB to 3 tables in HBase.
>

Are these insertions the output of the MR jobs?

If so, I would strongly recommend the bulk load functionality. It is
somewhere between 10x and 100x more efficient than direct API usage.

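As a rough illustration of why bulk load is cheaper: instead of sending each
row through the normal write path, the job writes store files that are already
sorted and already split along region boundaries, and those files are then
handed to the region servers. The sketch below is plain Java, not the HBase
API. The class name, method names, and sample keys are hypothetical, and it
compares Java strings where HBase compares raw bytes. It only shows the
partition-and-sort step that makes the finished files line up with regions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: this is NOT the HBase bulk load API.
public class BulkLoadPartitionSketch {

    // Returns the index of the region whose key range contains rowKey.
    // splitPoints must be sorted; region i covers [splitPoints[i-1], splitPoints[i]).
    public static int regionFor(String rowKey, List<String> splitPoints) {
        int pos = Collections.binarySearch(splitPoints, rowKey);
        // Exact match: the key is the start key of the next region.
        // No match: binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    // Groups row keys by target region and sorts each group, mimicking
    // one sorted output file per region.
    public static List<List<String>> partitionAndSort(List<String> rowKeys,
                                                      List<String> splitPoints) {
        List<List<String>> regions = new ArrayList<>();
        for (int i = 0; i <= splitPoints.size(); i++) {
            regions.add(new ArrayList<>());
        }
        for (String key : rowKeys) {
            regions.get(regionFor(key, splitPoints)).add(key);
        }
        for (List<String> region : regions) {
            Collections.sort(region);
        }
        return regions;
    }

    public static void main(String[] args) {
        // Assumed table with three regions: [.., g), [g, p), [p, ..)
        List<String> splits = Arrays.asList("g", "p");
        List<String> keys = Arrays.asList("zebra", "apple", "mango", "grape", "pear", "fig");
        // prints [[apple, fig], [grape, mango], [pear, zebra]]
        System.out.println(partitionAndSort(keys, splits));
    }
}
```

In a real job the MapReduce framework does this partitioning and sorting
across reducers, so each reducer's output falls entirely inside one region.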

> 3) Run 10-20 scans per day (scanning about 20 regions in a table).
> All this should run in parallel.
> Our current configuration can't cope with this load and we are having many
> stability issues.
>
> This is what we have in mind :
> 1. Master machine - 32 GB, 4 TB, Two quad core CPUs.
> 2. Name node - 16 GB, 2TB, Two quad core CPUs.
> we plan to have up to 20 name servers (starting with 5).
>
> We already read
>
> http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
> .
>
> We would appreciate your feedback on our proposed configuration.
>
>
> Regards Oleg & Lior
>



-- 
Todd Lipcon
Software Engineer, Cloudera
