hadoop-common-issues mailing list archives

From "Jan Kunigk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies
Date Mon, 18 Mar 2013 16:10:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605252#comment-13605252 ]

Jan Kunigk commented on HADOOP-8468:
------------------------------------

Junping,

Referring to one of your earlier comments on 08/Jun/12:
> For 2. It's right that VMs on the same host will not share storage directly
> but could do so (with getting virtual disks) through Hypervisor FS (Like VMFS in VMware
> vSphere) layer.
> Another way (should recommend for hadoop case) is to go through RDM (Raw Disk Mapping)
> configuration in hypervisor that each VM can get some dedicated physical disks.

Are you envisioning a usage model where each virtual cluster has its own distributed filesystem?
When I use virtualization, I would most likely suspend my virtual clusters from time to time.
Can you comment on what would happen to the HDFS data in this case? Would one have to persist
it in a different storage tier?
                
> Umbrella of enhancements to support different failure and locality topologies
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-8468
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8468
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ha, io
>    Affects Versions: 1.0.0, 2.0.0-alpha
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: HADOOP-8468-total.patch, HADOOP-8468-total-v3.patch, HVE_Hadoop World Meetup 2012.pptx, HVE User Guide on branch-1(draft ).pdf, Proposal for enchanced failure and locality topologies.pdf, Proposal for enchanced failure and locality topologies (revised-1.0).pdf
>
>
> The current Hadoop network topology (described in earlier issues such as HADOOP-692)
> works well in the classic three-tier network it was introduced for. However, it does not
> take into account other failure models or infrastructure changes, such as virtualization,
> that can affect network bandwidth efficiency.
> A virtualized platform has the following characteristics that the Hadoop topology should
> not ignore when scheduling tasks, placing replicas, balancing, or fetching blocks for reading:
> 1. VMs on the same physical host are affected by the same hardware failure. In order
> to match the reliability of a physical deployment, replication of data across two virtual
> machines on the same host should be avoided.
> 2. The network between VMs on the same physical host has higher throughput and lower
> latency and does not consume any physical switch bandwidth.
> Thus, we propose to make the Hadoop network topology extensible and introduce a new level
> in the hierarchical topology, a node group level, which maps well onto an infrastructure that
> is based on a virtualized environment.
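
As a rough illustration of the node group idea described in the issue, the sketch below models a topology path with an extra level between rack and node and picks replica targets so that no two land in the same node group (that is, on the same physical host). The "/rack/nodegroup/node" path layout and the names parse_location, same_node_group, and pick_replica_targets are assumptions made for this example only; this is not Hadoop's API or the placement policy from the attached patches.

```python
from typing import List, NamedTuple


class Location(NamedTuple):
    rack: str
    node_group: str  # assumption: one node group per physical host
    node: str        # a VM running a datanode


def parse_location(path: str) -> Location:
    """Parse a hypothetical '/rack/nodegroup/node' topology path."""
    parts = path.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"expected /rack/nodegroup/node, got {path!r}")
    return Location(*parts)


def same_node_group(a: Location, b: Location) -> bool:
    """True if both nodes sit on the same physical host (same rack and node group)."""
    return a.rack == b.rack and a.node_group == b.node_group


def pick_replica_targets(candidates: List[str], replicas: int) -> List[str]:
    """Greedily pick up to `replicas` nodes, at most one per node group.

    Illustrative only: a real placement policy also weighs rack diversity,
    load, and free space.
    """
    chosen: List[str] = []
    used_groups = set()
    for path in candidates:
        loc = parse_location(path)
        key = (loc.rack, loc.node_group)
        if key in used_groups:
            continue  # another replica already lives on this physical host
        used_groups.add(key)
        chosen.append(path)
        if len(chosen) == replicas:
            break
    return chosen
```

With candidates "/r1/ng1/vm1", "/r1/ng1/vm2", "/r1/ng2/vm3", and "/r2/ng3/vm4", asking for three replicas skips vm2 because it shares node group ng1 with vm1, which reflects point 1 of the description.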

