There are times when you need a clear understanding of how a computer's hardware goes together. In my case, I'm working on a project using XDP to rapidly process packets in the kernel, bypassing the Linux networking stack.

For my testing and development, it's important to know which network cards are connected to which CPUs. For best performance, packet data should not cross the interconnect from one NUMA node to another; it should be handled entirely within the NUMA node the NIC is attached to.
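As an aside, sysfs can answer that question for an individual NIC without any extra tooling (the interface name below is just a placeholder; substitute whatever ip link reports on your machine), though it tells you nothing about the rest of the system:

$ cat /sys/class/net/eth0/device/numa_node      # NUMA node the NIC's PCI slot hangs off of (-1 if the platform doesn't report one)
$ cat /sys/class/net/eth0/device/local_cpulist  # CPUs considered local to that device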

In many cases lshw does the needful, but its output is rather verbose and one can lose sight of the forest for the trees:

char@underbed:~$ sudo lshw | head -n 40
underbed
    description: Rack Mount Chassis
    product: PowerEdge R630 (SKU=NotProvided;ModelName=PowerEdge R630)
    vendor: Dell Inc.
    serial: 5QP0B42
    width: 64 bits
    capabilities: smbios-2.8 dmi-2.8 smp vsyscall32
    configuration: boot=normal chassis=rackmount sku=SKU=NotProvided;ModelName=PowerEdge R630 uuid=4c4c4544-0051-5010-8030-b5c04f423432
  *-core
       description: Motherboard
       product: 0CNCJW
       vendor: Dell Inc.
       physical id: 0
       version: A05
       serial: .5QP0B42.CN7475151G0196.
     *-firmware
          description: BIOS
          vendor: Dell Inc.
          physical id: 0
          version: 2.19.0
          date: 12/12/2023
          size: 64KiB
          capacity: 16MiB
          capabilities: isa pci pnp upgrade shadowing cdboot bootselect edd int13floppytoshiba int13floppy360 int13floppy1200 int13floppy720 int9keyboard int14serial int10video acpi usb biosbootspecification netboot uefi
     *-cpu:0
          description: CPU
          product: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
          vendor: Intel Corp.
          physical id: 400
          bus info: cpu@0
          version: 6.63.2
          slot: CPU1
          size: 1974MHz
          capacity: 4GHz
          width: 64 bits
          clock: 3705MHz
          capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp x86-64 constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts vnmi md_clear flush_l1d ibpb_exit_to_user cpufreq
          configuration: cores=8 enabledcores=8 microcode=73 threads=16
        *-cache:0
             description: L1 cache
...

There's another useful tool called lstopo that is part of the hwloc package in many Linux distributions. lstopo creates a visualization of the system's layout that filters out less important information and even elides repeated objects, revealing the forest that lshw loses sight of. Here's example output from the same system shown above:

lstopo output visualization
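A picture like this only takes a couple of commands to produce; the package name here assumes a Debian-style distribution, and topology.png is just an example filename:

$ sudo apt install hwloc    # provides lstopo (use your distribution's equivalent)
$ lstopo topology.png       # render to a file; the format is inferred from the extension
$ lstopo                    # with no arguments, opens an interactive window when a display is available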

Now one can clearly see which PCIe hardware is homed to which NUMA node. Node 0, on the left, has sda, my disk drive, along with two NICs: PCI 03:00 is a 2-port Intel 82599 10GbE NIC that I got for $15 on eBay, and PCI 01:00 is a 4-port NIC that came with the server. On Node 1, there's another 2-port 82599 at PCI 81:00.
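You can cross-check that mapping against sysfs using the PCI addresses from the diagram. The .0 function suffix below is an assumption about the first port on each card:

$ cat /sys/bus/pci/devices/0000:03:00.0/numa_node      # should read 0 for the eBay 82599
$ cat /sys/bus/pci/devices/0000:81:00.0/numa_node      # should read 1 for the card on the other socket
$ cat /sys/bus/pci/devices/0000:03:00.0/local_cpulist  # the CPUs that share a node with that NIC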

Along with the peripherals, you can also clearly see that each NUMA node has its own RAM, an L3 cache shared by all of that node's cores, and individual L2 and L1 caches for each core.
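If you just want the numbers rather than a diagram, the same summary is available in text form; the grep pattern here is only to trim lscpu's output:

$ lscpu | grep -Ei 'numa|cache'   # cache sizes and per-node CPU lists
$ lstopo-no-graphics --no-io      # the same hwloc tree as the image, minus the I/O devices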

Pretty neat!