ignite-issues mailing list archives

From "Ivan Veselovsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-3414) Hadoop: Optimize map-reduce job planning.
Date Wed, 06 Jul 2016 18:16:11 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-3414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364789#comment-15364789 ]

Ivan Veselovsky commented on IGNITE-3414:
-----------------------------------------

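In org.apache.ignite.hadoop.mapreduce.IgniteHadoopWeightedMapReducePlanner, line 218 (marked
with ******* below), the collection ordering appears to be lost: the values of the sorted
TreeMap are copied into a HashSet, which does not preserve iteration order. It would be better
to return a list or a LinkedHashSet: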
{code} 
    @Nullable private Collection<UUID> igfsAffinityNodesForSplit(HadoopInputSplit split)
        throws IgniteCheckedException {
..............................
                            Map<NodeIdAndLength, UUID> res = new TreeMap<>();

                            for (Map.Entry<UUID, Long> idToLenEntry : idToLen.entrySet()) {
                                UUID id = idToLenEntry.getKey();

                                res.put(new NodeIdAndLength(id, idToLenEntry.getValue()), id);
                            }

                            return new HashSet<>(res.values()); /// *********
                        }
                    }
                }
            }
        }

        return null;
    }
{code}
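
To illustrate the concern, here is a minimal, self-contained sketch. It is not the Ignite code: the NodeIdAndLength class below is a hypothetical stand-in that is assumed to sort by length in descending order and then by node id, and the class and method names (OrderingSketch, unordered, orderedList, orderedSet) are invented for the example. It shows that copying the TreeMap values into an ArrayList or LinkedHashSet keeps the sorted order, while a HashSet makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.UUID;

class OrderingSketch {
    /** Hypothetical stand-in for NodeIdAndLength (assumed: length descending, then id). */
    static final class NodeIdAndLength implements Comparable<NodeIdAndLength> {
        final UUID id;
        final long len;

        NodeIdAndLength(UUID id, long len) {
            this.id = id;
            this.len = len;
        }

        @Override public int compareTo(NodeIdAndLength other) {
            int res = Long.compare(other.len, len); // Larger length sorts first.

            return res != 0 ? res : id.compareTo(other.id);
        }
    }

    /** Builds the sorted map the same way the snippet above does. */
    static Map<NodeIdAndLength, UUID> sorted(Map<UUID, Long> idToLen) {
        Map<NodeIdAndLength, UUID> res = new TreeMap<>();

        for (Map.Entry<UUID, Long> e : idToLen.entrySet())
            res.put(new NodeIdAndLength(e.getKey(), e.getValue()), e.getKey());

        return res;
    }

    /** Mirrors the marked line: HashSet iteration order is unspecified. */
    static Collection<UUID> unordered(Map<UUID, Long> idToLen) {
        return new HashSet<>(sorted(idToLen).values());
    }

    /** Alternative 1: a list keeps the TreeMap iteration order. */
    static List<UUID> orderedList(Map<UUID, Long> idToLen) {
        return new ArrayList<>(sorted(idToLen).values());
    }

    /** Alternative 2: LinkedHashSet keeps insertion order and set semantics. */
    static Set<UUID> orderedSet(Map<UUID, Long> idToLen) {
        return new LinkedHashSet<>(sorted(idToLen).values());
    }

    public static void main(String[] args) {
        Map<UUID, Long> idToLen = new HashMap<>();

        idToLen.put(new UUID(0L, 1L), 10L);
        idToLen.put(new UUID(0L, 2L), 300L);
        idToLen.put(new UUID(0L, 3L), 200L);

        System.out.println("HashSet (order not guaranteed): " + unordered(idToLen));
        System.out.println("ArrayList (sorted order kept):  " + orderedList(idToLen));
        System.out.println("LinkedHashSet (sorted order):   " + orderedSet(idToLen));
    }
}
```

Since each node id appears at most once in the source map, the list and the LinkedHashSet hold the same elements here; the choice mostly depends on whether callers rely on Set semantics.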

> Hadoop: Optimize map-reduce job planning.
> -----------------------------------------
>
>                 Key: IGNITE-3414
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3414
>             Project: Ignite
>          Issue Type: Task
>          Components: hadoop
>    Affects Versions: 1.6
>            Reporter: Vladimir Ozerov
>            Assignee: Vladimir Ozerov
>            Priority: Critical
>             Fix For: 1.7
>
>
> Currently the Hadoop module has an inefficient map-reduce planning engine. In particular, it
> assigns tasks only to affinity nodes. This can lead to a situation where a very large task is
> processed by a single cluster node while other cluster nodes are idle.
> We should implement a configurable map-reduce planner that is able to utilize the
> whole cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
