hadoop-common-issues mailing list archives

From "Luke Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8468) Umbrella of enhancements to support different failure and locality topologies
Date Tue, 05 Jun 2012 21:35:23 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289747#comment-13289747
] 

Luke Lu commented on HADOOP-8468:
---------------------------------

Actually, the two approaches are orthogonal. Avoiding placing more than one data node of the
same logical cluster on the same physical host will increase reliability even if the new topology
algorithm is in place. 

VM placement is only NP-hard if instance configurations are arbitrary and you require an absolutely
optimal placement. It's easier if the number of instance types is limited, a la AWS. I suspect
that greedy algorithms exist to approximate the optimal placement. We don't need millisecond
response time for such a placement algorithm either, since it only runs once at logical
cluster deploy time and again when physical hosts fail.

It's definitely easier to do such placement when the number of nodes in a logical cluster is much
smaller than the number of physical hosts, which is the case for AWS and SmartCloud.
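
As a rough illustration of the greedy idea, here is a minimal Python sketch. It is not taken from
the attached patches or from any real scheduler. The function name place_cluster, the abstract
"size units" for VMs and hosts, and the best-fit tie-breaking rule are all assumptions made for
the example. It sorts the VMs of one logical cluster largest first and gives each one its own
physical host, so no two data nodes of the cluster share a host.

```python
def place_cluster(vm_sizes, host_capacity):
    """Greedy placement of one logical cluster with host anti-affinity.

    vm_sizes:      dict of VM name -> required size units (hypothetical units)
    host_capacity: dict of host name -> free size units
    Returns a dict of VM name -> host name.
    Raises ValueError if some VM cannot get a host of its own.
    """
    free = dict(host_capacity)   # copy so the caller's dict is not modified
    used_hosts = set()           # hosts already holding a VM of this cluster
    placement = {}

    # Largest VMs first: they have the fewest hosts they can fit on.
    for vm, size in sorted(vm_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        candidates = [h for h in free
                      if h not in used_hosts and free[h] >= size]
        if not candidates:
            raise ValueError(f"no unused host with {size} free units for {vm}")
        # Best fit: leave as little unused capacity as possible, ties by name.
        host = min(candidates, key=lambda h: (free[h] - size, h))
        placement[vm] = host
        free[host] -= size
        used_hosts.add(host)
    return placement


if __name__ == "__main__":
    vms = {"dn1": 4, "dn2": 2, "dn3": 2}
    hosts = {"hostA": 8, "hostB": 4, "hostC": 2, "hostD": 2}
    print(place_cluster(vms, hosts))
```

A greedy pass like this is not guaranteed to find a placement whenever one exists, but with few
instance types and many more hosts than cluster nodes it rarely needs to backtrack.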
                
> Umbrella of enhancements to support different failure and locality topologies
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-8468
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8468
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha, io
>    Affects Versions: 1.0.0, 2.0.0-alpha
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: HADOOP-8468-total-v3.patch, HADOOP-8468-total.patch, Proposal for enchanced failure and locality topologies.pdf
>
>
> The current Hadoop network topology (described in earlier issues such as HADOOP-692)
> worked well for the classic three-tier network when it was introduced. However, it does not
> take into account other failure models or infrastructure changes that can affect network
> bandwidth efficiency, such as virtualization.
> A virtualized platform has the following characteristics, which the Hadoop topology should not
> ignore when scheduling tasks, placing replicas, balancing, or fetching blocks for reading:
> 1. VMs on the same physical host are affected by the same hardware failure. In order
> to match the reliability of a physical deployment, replication of data across two virtual
> machines on the same host should be avoided.
> 2. The network between VMs on the same physical host has higher throughput and lower
> latency and does not consume any physical switch bandwidth.
> Thus, we propose to make the Hadoop network topology extensible and introduce a new level
> in the hierarchical topology, a node group level, which maps well onto an infrastructure that
> is based on a virtualized environment.
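
To make the node group level in the quoted description concrete, here is a small Python sketch.
It is only an illustration and not Hadoop's implementation: the three-level path layout
/rack/nodegroup/vm, the function names node_group and pick_replica_targets, and the simple
"first candidate wins" selection are all assumptions. It shows how a node group level lets replica
selection skip VMs that share a physical host. A real placement policy would also weigh rack
locality and load.

```python
def node_group(location):
    """Return the '/rack/nodegroup' prefix of a hypothetical topology path.

    Assumes paths of the form '/rack/nodegroup/vm', where a node group
    stands for one physical host. This layout is illustrative only.
    """
    parts = location.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"expected /rack/nodegroup/vm, got {location!r}")
    return "/" + "/".join(parts[:2])


def pick_replica_targets(candidates, replicas):
    """Pick up to `replicas` VMs so that no two share a node group."""
    chosen = []
    seen_groups = set()
    for location in candidates:
        group = node_group(location)
        if group in seen_groups:
            continue  # same physical host as an earlier replica, skip it
        chosen.append(location)
        seen_groups.add(group)
        if len(chosen) == replicas:
            break
    return chosen


if __name__ == "__main__":
    vms = [
        "/rack1/ng1/vm1",
        "/rack1/ng1/vm2",  # same node group (host) as vm1
        "/rack1/ng2/vm3",
        "/rack2/ng3/vm4",
    ]
    print(pick_replica_targets(vms, 3))
```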

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
