hadoop-common-user mailing list archives

From daemeon reiydelle <daeme...@gmail.com>
Subject Re: Datanode disk configuration
Date Wed, 12 Nov 2014 17:24:27 GMT
Yes. That is why you should consider striping across RAID 0 (JBOD).

“The race is not to the swift, nor the battle to the strong, but to
those who can see it coming and jump aside.” - Hunter Thompson
Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Wed, Nov 12, 2014 at 9:09 AM, Brian C. Huffman <
bhuffman@etinternational.com> wrote:

>  That will make the volume balancing easy, but couldn't it hurt
> performance?  My understanding is that there would be three write threads
> pointing to the 3TB disk and 2 threads pointing to the 2TB disk.
>
> Would it be better from a performance perspective to include the 500GB
> drive in the configuration and just use the
> AvailableSpaceVolumeChoosingPolicy from the beginning?
>
> Thanks,
> Brian
>
> On 11/12/2014 11:47 AM, Leonid Fedotov wrote:
>
> Create 1 TB partitions on the 2 TB and 3 TB drives and you will have 5
> mount points of the same size.
>
>
>   *Thank you!*
>
>
>  *Sincerely,*
>
> *Leonid Fedotov*
>
> Systems Architect - Professional Services
>
> lfedotov@hortonworks.com
>
> office: +1 855 846 7866 ext 292
>
> mobile: +1 650 430 1673
>
> On Wed, Nov 12, 2014 at 8:36 AM, Brian C. Huffman <
> bhuffman@etinternational.com> wrote:
>
>>  All,
>>
>> I'm setting up a 4-node Hadoop 2.5.1 cluster.  Each node has the
>> following drives:
>> 1 - 500GB drive (OS disk)
>> 1 - 500GB drive
>> 1 - 2 TB drive
>> 1 - 3 TB drive.
>>
>> In past experience I've had lots of issues with non-uniform drive sizes
>> for HDFS, but unfortunately it wasn't an option to get all 3TB or 2TB
>> drives for this cluster.
>>
>> My thought is to set up the 2TB and 3TB drives as HDFS and the 500GB
>> drive as intermediate data.  Most of our jobs don't make large use of
>> intermediate data, but at least this way, I get a good amount of space
>> (2TB) per node before I run into issues.  Then I may end up using the
>> AvailableSpaceVolumeChoosingPolicy to help with balancing the blocks.
>>
>> If necessary I could put intermediate data on one of the OS partitions
>> (/home).  But this doesn't seem ideal.
>>
>> Anybody have any recommendations regarding the optimal use of storage in
>> this scenario?
>>
>> Thanks,
>> Brian
>>
>
>
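
The write-distribution concern raised in the thread (five equal mount points, two on the 2 TB disk and three on the 3 TB disk) can be checked with a little arithmetic. The sketch below is only an illustration, not HDFS code: it assumes the DataNode's default round-robin volume choice, treats every block as the same size, ignores the 500GB drive, and uses hypothetical mount paths (/data/1 through /data/5) and disk labels that do not appear in the original messages.

```python
from collections import Counter
from itertools import cycle

# Hypothetical layout: 1 TB partitions carved from the 2 TB and 3 TB drives.
# Keys are mount points, values are the physical disk behind each one.
MOUNTS = {
    "/data/1": "disk_2tb",
    "/data/2": "disk_2tb",
    "/data/3": "disk_3tb",
    "/data/4": "disk_3tb",
    "/data/5": "disk_3tb",
}


def round_robin_share(mounts, num_blocks):
    """Fraction of blocks landing on each physical disk when blocks are
    assigned to mount points in strict round-robin order."""
    per_disk = Counter()
    chooser = cycle(mounts)  # cycles over mount points in insertion order
    for _ in range(num_blocks):
        mount = next(chooser)
        per_disk[mounts[mount]] += 1
    return {disk: count / num_blocks for disk, count in per_disk.items()}


shares = round_robin_share(MOUNTS, 1000)
print(shares)
```

Under these assumptions the 2 TB disk receives 40% of the writes and the 3 TB disk 60%, which matches the "2 threads versus 3 threads" intuition in the question: space fills evenly across the partitions, but the 3 TB spindle carries half again as much I/O as the 2 TB one.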
