hadoop-hdfs-user mailing list archives

From Alexander Pivovarov <apivova...@gmail.com>
Subject Re: HDFS disk space requirement
Date Fri, 11 Jan 2013 05:49:56 GMT
Finish elementary school first (plus and minus operations, at least).


On Thu, Jan 10, 2013 at 7:23 PM, Panshul Whisper <ouchwhisper@gmail.com> wrote:

> Thank you for the response.
>
> Actually, it is not a single file. I have JSON files that amount to 115 GB;
> these need to be processed and loaded into HBase tables
> on the same cluster for later processing. Not counting the disk space
> required for HBase storage, if I reduce the replication to 3, how much
> more HDFS space will I require?
>
> Thank you,
>
>
> On Fri, Jan 11, 2013 at 4:16 AM, Ravi Mutyala <ravi@hortonworks.com> wrote:
>
>> If the file is a text file, you could get a good compression ratio.
>> Change the replication to 3 and the file will fit. But I am not sure what your
>> use case is or what you want to achieve by putting this data there. Any
>> transformation of this data would need more space to save the
>> transformed data.
>>
>> If you have 5 nodes and they are not virtual machines, you should
>> consider adding more hard disks to your cluster.
>>
>>
>> On Thu, Jan 10, 2013 at 9:02 PM, Panshul Whisper <ouchwhisper@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have a Hadoop cluster of 5 nodes with a total of 130 GB of available
>>> HDFS space, with replication set to 5.
>>> I have a file of 115 GB, which needs to be copied to HDFS and
>>> processed.
>>> Do I need any more HDFS space to perform all the processing
>>> without running into problems, or is this space sufficient?
>>>
>>> --
>>> Regards,
>>> Ouch Whisper
>>> 010101010101
>>>
>>
>>
>
>
> --
> Regards,
> Ouch Whisper
> 010101010101
>
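
The arithmetic behind the question can be sketched as follows. This is only a rough estimate, not an official sizing method: HDFS stores one full copy of every block per replica, so raw usage is roughly the logical size times the replication factor. The thread never settles whether the 130 GB figure is raw capacity summed over the DataNodes or usable space already divided by the replication factor of 5, so both readings are shown. The function names are hypothetical, and the sketch ignores space reserved for non-DFS use, intermediate job output, and HBase storage.

```python
def raw_space_needed_gb(logical_gb: float, replication: int) -> float:
    """Raw disk space consumed when each block is stored `replication` times."""
    return logical_gb * replication


def shortfall_gb(logical_gb: float, replication: int, raw_capacity_gb: float) -> float:
    """Extra raw space required, or 0.0 if the data already fits."""
    needed = raw_space_needed_gb(logical_gb, replication)
    return max(0.0, needed - raw_capacity_gb)


if __name__ == "__main__":
    data_gb = 115.0  # size of the JSON files, from the thread

    # Reading 1 (assumption): 130 GB is the raw capacity of the whole cluster.
    # Reading 2 (assumption): 130 GB is usable space at replication 5,
    # which would make the raw capacity 130 * 5 = 650 GB.
    for label, raw_capacity in (("130 GB raw", 130.0), ("650 GB raw", 650.0)):
        for replication in (5, 3):
            needed = raw_space_needed_gb(data_gb, replication)
            missing = shortfall_gb(data_gb, replication, raw_capacity)
            print(f"{label}, replication {replication}: "
                  f"need {needed:.0f} GB, short by {missing:.0f} GB")
```

Under the first reading the data does not fit even at replication 3; under the second it fits at both settings, which would be consistent with the earlier reply saying the file will fit. Either way, any transformed output needs additional space on top of this.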
