NUMA Architecture

Last updated: 2022-02-07

Introduction

NUMA: Non-Uniform Memory Access

In a NUMA system, processors, memory, and I/O are grouped together into nodes/cells.

The memory in NUMA systems is physically distributed but logically shared.

Under NUMA, a processor can access its own local memory faster than non-local memory (memory attached to another node).

Each of the ‘cells’ may be viewed as an SMP [symmetric multi-processor]

(Cell 1)                   (Cell 2)
CPU 1 --- Inter-connect --- CPU 2
 |                           |
RAM 1                       RAM 2

Inter-Connect

  • AMD: HyperTransport (HT) interconnect
  • Intel: QuickPath Interconnect (QPI)

Applications: supercomputers (addresses the bottleneck of many cores all contending for one path to memory)

Table of Contents

  • Nodes Info.
  • BIOS Setting: Node interleaving

 


Nodes Info.

 

# numactl - Control NUMA policy for processes or shared memory

apt-get install numactl -y

numactl --hardware

available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 15857 MB
node 0 free: 1346 MB
node distances:
node   0
  0:  10
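
numactl can also bind a process to a node. A minimal sketch, where "myapp" is just a placeholder binary:

# Run a process with its CPUs and memory both restricted to node 0,
# so all of its allocations stay local
numactl --cpunodebind=0 --membind=0 ./myapp

# Show the NUMA policy that the current shell runs under
numactl --show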

System Locality Information Table (SLIT)

# Reported to the kernel via ACPI (Advanced Configuration and Power Interface)

 * It gives the normalized "distances" between nodes

node   0
  0:  10
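
The kernel exports the same distances under sysfs; on the single-node machine above the file simply contains 10:

# Relative distances from node 0 to every node (10 = local)
cat /sys/devices/system/node/node0/distance
10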

 


BIOS Setting: Node interleaving

 

Only available on NUMA (Non-Uniform Memory Access) architectures

Default: disabled

Node interleaving essentially lets the system decide where to put memory; with it disabled, the user (or OS) must explicitly say where in memory data should be placed so that the associated CPU gets the best performance.
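
Even with the BIOS option left disabled, a similar round-robin placement can be requested per process with numactl (a sketch; "myapp" is a placeholder):

# Spread this process's memory allocations evenly across all nodes
numactl --interleave=all ./myapp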

NUMA (Non-Uniform Memory Access)
Each node contains both processors and memory.
When a processor accesses memory that does not lie within its own node (remote memory), the data must be transferred over the NUMA interconnect.
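
numastat (shipped in the same numactl package) shows how often allocations stayed local versus crossed the interconnect; the figures below are purely illustrative:

# Per-node allocation counters; numa_miss / other_node indicate remote traffic
numastat
                           node0
numa_hit                 1234567
numa_miss                      0
numa_foreign                   0
interleave_hit              6789
local_node               1234000
other_node                   567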

benchmark: AMDAPPSDK

* non-uniform refers to memory access time: local memory is faster to reach than remote memory

Disabled => System Resource Affinity Table (SRAT) is exposed
Enabled  => UMA => no SRAT (ESX will be unaware of the underlying physical architecture)

* So with node interleaving disabled, the VM will run a bit faster
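
A quick check of whether the firmware actually handed a SRAT to the kernel (the exact log text varies by kernel version):

# Look for SRAT entries parsed at boot time
dmesg | grep -i SRAT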