hadoop-hdfs-user mailing list archives

From Panshul Whisper <ouchwhis...@gmail.com>
Subject Estimating disk space requirements
Date Fri, 18 Jan 2013 12:11:20 GMT

I am trying to estimate how much disk space I need for my cluster.

I have approximately 24 million JSON documents of about 5 KB each.
The JSON is to be stored in HBase, with some identifying data in columns.
I also want to store the full JSON for later retrieval, using the ID data
as keys in HBase.
My HDFS replication factor is set to 3.
Each node has Hadoop, HBase, and Ubuntu installed on it, so approximately
11 GB is available for use on each 20 GB node.

I have not enabled HBase replication, and I am not sure whether HDFS
replication alone is enough to keep the data safe and redundant.
How much total disk space will I need to store the data?
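
As a starting point, the figures above can be turned into a rough
back-of-the-envelope calculation. This is only a sketch under stated
assumptions: it takes 5 KB as 5,000 bytes and 11 GB as 11 * 10^9 bytes, and
the `overhead_factor` parameter is a hypothetical placeholder for HBase
per-cell key overhead and compaction headroom, which would have to be
measured on real data. With the factor left at 1.0 the result is a lower
bound, not a sizing recommendation.

```python
import math


def estimate_raw_hdfs_bytes(num_docs, avg_doc_bytes, replication,
                            overhead_factor=1.0):
    """Return raw disk bytes consumed across the cluster.

    overhead_factor is a hypothetical multiplier (assumption, not a
    measured value) for HBase storage overhead on top of the payload.
    """
    logical_bytes = num_docs * avg_doc_bytes * overhead_factor
    # HDFS keeps `replication` copies of every block.
    return logical_bytes * replication


def nodes_needed(total_bytes, usable_bytes_per_node):
    """Return the minimum node count whose usable space covers total_bytes."""
    return math.ceil(total_bytes / usable_bytes_per_node)


# Figures taken from the message; unit conversions are decimal (assumption).
NUM_DOCS = 24_000_000
AVG_DOC_BYTES = 5 * 1000
REPLICATION = 3
USABLE_PER_NODE = 11 * 1000 ** 3

total = estimate_raw_hdfs_bytes(NUM_DOCS, AVG_DOC_BYTES, REPLICATION)
nodes = nodes_needed(total, USABLE_PER_NODE)

print(f"logical data: {NUM_DOCS * AVG_DOC_BYTES / 1e9:.0f} GB")
print(f"raw HDFS usage at replication {REPLICATION}: {total / 1e9:.0f} GB")
print(f"minimum nodes at 11 GB usable each: {nodes}")
```

Under these assumptions the payload alone is about 120 GB, or about 360 GB
after 3x replication, which is at least 33 nodes at 11 GB each. Any
real overhead factor above 1.0 pushes both numbers up.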

Please help me estimate this.

Thank you so much.

Ouch Whisper
