flink-issues mailing list archives

From "Greg Hogan (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-3163) Configure Flink for NUMA systems
Date Fri, 11 Dec 2015 18:46:11 GMT
Greg Hogan created FLINK-3163:

             Summary: Configure Flink for NUMA systems
                 Key: FLINK-3163
                 URL: https://issues.apache.org/jira/browse/FLINK-3163
             Project: Flink
          Issue Type: Improvement
          Components: Start-Stop Scripts
    Affects Versions: 1.0.0
            Reporter: Greg Hogan
            Assignee: Greg Hogan

On NUMA systems, Flink can be pinned to a single physical processor ("node") using
{{numactl --membind=$node --cpunodebind=$node <command>}}. Commonly available NUMA systems
include the largest AWS and Google Compute Engine instances.
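
Before binding anything, a start script would need to know how many NUMA nodes the machine has. The following is a minimal sketch, not part of Flink: the helper name `count_numa_nodes` is hypothetical, and it assumes a Linux host that exposes one `nodeN` directory per NUMA node under `/sys/devices/system/node`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flink): count the NUMA nodes by counting
# the nodeN directories that Linux exposes in sysfs.
count_numa_nodes() {
    # $1 = directory to inspect; defaults to the usual Linux sysfs location
    dir=${1:-/sys/devices/system/node}
    count=0
    for entry in "$dir"/node[0-9]*; do
        # An unmatched glob stays a literal string and fails the -d test
        if [ -d "$entry" ]; then
            count=$(( count + 1 ))
        fi
    done
    # Assumption: a host with no NUMA information is treated as one node
    if [ "$count" -eq 0 ]; then
        count=1
    fi
    echo "$count"
}

count_numa_nodes
```

Where `numactl` is installed, `numactl --hardware` reports the same topology in human-readable form.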

For example, on an AWS c4.8xlarge system with 36 hyperthreads, the user could either configure
a single TaskManager with 36 slots or have Flink create two TaskManagers, one bound to each
NUMA node, each with 18 slots.
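
The split described above can be sketched as a dry run. The function names `slots_per_node` and `numa_launch_plan` are hypothetical and not part of Flink's scripts; the sketch only prints the `numactl` command lines instead of executing them, and it assumes the hardware threads divide evenly among the nodes.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Flink). Nothing is launched here:
# the commands are only printed so the plan can be inspected.

slots_per_node() {
    # $1 = hardware threads on the machine, $2 = number of NUMA nodes
    echo $(( $1 / $2 ))
}

numa_launch_plan() {
    # $1 = number of NUMA nodes; remaining arguments = command to wrap
    nodes=$1
    shift
    node=0
    while [ "$node" -lt "$nodes" ]; do
        echo "numactl --membind=$node --cpunodebind=$node $*"
        node=$(( node + 1 ))
    done
}

# The c4.8xlarge case: 36 hyperthreads across 2 NUMA nodes
slots_per_node 36 2
numa_launch_plan 2 '<command>'
```

For the 36-hyperthread, two-node case this yields 18 slots per TaskManager and one command line for node 0 and one for node 1. `<command>` is left as the placeholder used in the description.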

There may be some extra overhead in transferring network buffers between TaskManagers on the
same system, though the fraction of data shuffled in this manner decreases as the cluster
grows. The performance improvement from accessing only local memory appears to be significant,
though it is difficult to benchmark.
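
To give a rough sense of how that fraction shrinks, the sketch below assumes (this is not stated in the issue) that each TaskManager spreads its shuffled records uniformly over all TaskManagers in the cluster, itself included. Under that assumption the share sent to co-located peers is (k - 1) / (m * k) for m machines with k TaskManagers each. The function name `colocated_shuffle_percent` is hypothetical.

```shell
#!/bin/sh
# Hypothetical estimate under a uniform-shuffle assumption; real jobs
# may distribute data quite differently.
colocated_shuffle_percent() {
    # $1 = machines in the cluster, $2 = TaskManagers per machine
    total=$(( $1 * $2 ))
    # Integer percentage, truncated toward zero
    echo $(( 100 * ($2 - 1) / total ))
}

for m in 1 2 10; do
    echo "$m machine(s): $(colocated_shuffle_percent "$m" 2)% to the co-located TaskManager"
done
```

With two TaskManagers per machine this gives 50% on a single machine, 25% on two machines and 5% on ten.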

The JobManagers may also fit within a single NUMA node rather than requiring a full system.

This message was sent by Atlassian JIRA