vCPU, thread, core, node, socket. What do CPU terms mean these days?

November 9, 2023 · 5 min read

Ozgun Erdogan

Founder/Co-CEO

At Ubicloud, we’re building an open and portable cloud. One of the first cloud services we released was Elastic Compute, and I was reading a related Git commit message the other day. It struck me that we used to talk only about CPUs. Now, we talk about CPUs, vCPUs, threads, cores, nodes, dies, sockets, and packages.

https://github.com/ubicloud/ubicloud/commit/d30c157f2f1b2f3d7146824b836da7f4d6aa3973

Given the abundance of terms, we wanted to write a blog post that explains what all of them mean. Here’s a five-minute explanation.

Terminology varies between projects

Ubicloud uses Linux KVM for virtualization and Cloud Hypervisor as our Virtual Machine Monitor. Although these projects are in related spaces, they use different terminology. Here’s how the terms line up, from smallest unit to largest.

* Linux /proc & lscpu: cpu or thread, core, die, socket, node
* Cloud Hypervisor: vCPU or thread, core, die, package, node

So what Linux calls a socket, the Cloud Hypervisor calls a package. Once you know this, you can then map terms across projects.
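The mapping above can be written down as a small lookup table. This is only an illustrative sketch: the dictionary and the `to_cloud_hypervisor` helper are hypothetical names, and pairing "cpu" with "vCPU" is our reading of the two lists, not something either project defines.

```python
# Hypothetical lookup table built from the two lists above.
# Only "socket" -> "package" actually changes name; "cpu" -> "vCPU" is an assumption.
LINUX_TO_CLOUD_HYPERVISOR = {
    "cpu": "vCPU",
    "thread": "thread",
    "core": "core",
    "die": "die",
    "socket": "package",
    "node": "node",
}


def to_cloud_hypervisor(linux_term):
    """Translate a Linux/lscpu term into the Cloud Hypervisor term."""
    return LINUX_TO_CLOUD_HYPERVISOR[linux_term.lower()]


print(to_cloud_hypervisor("Socket"))  # package
```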

First, what do thread, core, die, and socket mean?

Socket is the oldest of these concepts. It's a receptacle, a physical connector linking the processor to the computer’s motherboard. Most PCs in the 1990s had just one CPU and so just one CPU socket. If you wanted another CPU then you needed a motherboard with another CPU socket. Two-socket PC motherboards first appeared around 1995. A processor package sits in this socket and contains one or more dies.

A die is a single piece of silicon that can contain any number of cores. A processor die is where the transistors making up the processor reside.

In the mid-2000s, AMD and Intel started taking multiple CPUs (as they were defined back then) and putting them in the same package. What we referred to as a CPU became a physical core. Total core count today is an important metric for performance.

For example, a dual-core processor is a processor package that has two physical cores inside. The cores can sit on one die or on two dies. First-generation multi-core processors often used several dies in a single package. Modern designs put the cores on the same die, which brings advantages such as a shared on-die cache.

Finally, we have threads. These are logical processors that run within the same physical core. Intel popularized this notion in 2002 with "Hyper-Threading Technology." Hyper-threading, or more generally simultaneous multithreading (SMT), is a hardware optimization for an old problem. Processors are often data-starved, waiting on data from slower memory or storage. When a thread stalls for long enough, the OS can context switch to another one, but that costs a few thousand CPU cycles. Hyper-threading avoids the context switch by giving the core a second set of registers, the fastest storage in the processor, that is already loaded with another thread's state and ready to go.

In most x64 server configurations, the ratio of threads to cores is 2:1.
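To see how threads map onto cores on a Linux machine, you can read the CPU topology files that the kernel exposes under sysfs. The sketch below assumes a Linux host where `/sys/devices/system/cpu/cpuN/topology/` contains `physical_package_id` and `core_id`; the function names are our own, and on other systems the script simply finds nothing.

```python
from collections import defaultdict
from pathlib import Path


def read_topology(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {(package_id, core_id): [logical CPU numbers]} from Linux sysfs."""
    cores = defaultdict(list)
    for cpu_dir in Path(sysfs_cpu_root).glob("cpu[0-9]*"):
        topo = cpu_dir / "topology"
        try:
            package_id = int((topo / "physical_package_id").read_text())
            core_id = int((topo / "core_id").read_text())
        except (OSError, ValueError):
            continue  # offline or unusual CPUs may not expose these files
        cores[(package_id, core_id)].append(int(cpu_dir.name[3:]))
    return {key: sorted(cpus) for key, cpus in cores.items()}


def threads_per_core(cores):
    """Largest number of logical CPUs (threads) sharing one physical core."""
    return max((len(cpus) for cpus in cores.values()), default=0)


if __name__ == "__main__":
    topology = read_topology()
    for (package_id, core_id), cpus in sorted(topology.items()):
        print(f"socket {package_id} core {core_id}: threads {cpus}")
    print("threads per core:", threads_per_core(topology))
```

On a machine with hyper-threading enabled, each core should list two logical CPUs; with it disabled, or on CPUs without SMT, each core lists one.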

What is a NUMA node?

Historically, all memory on x86 architectures was equally accessible by all threads. This model is known as Uniform Memory Access (UMA): access times are the same no matter which thread performs the operation.

Non-Uniform Memory Access (NUMA) tends to intrude when multiple sockets are involved. In a system with two or more sockets, each CPU socket has its own memory that it can access directly. It must also be able to access memory attached to the other sockets, and this of course takes more CPU cycles than accessing local memory. NUMA nodes specify which part of system memory is local to which threads.

You can configure NUMA behavior on your system to get the best possible performance for your workload. For example, you can restrict threads to local memory only, let them use all memory, or have them prefer local memory. This setting then affects how Linux distributes processes among the available threads.

Tying this to the above terminology, many processor packages support one to four NUMA nodes. Most configurations interleave all the NUMA nodes supported into one. However, for the benefit of specialized, NUMA-aware workloads, it is possible to decrease memory latency by declining to interleave the memory access of multiple cores.
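To check which logical CPUs belong to which NUMA node, Linux exposes one directory per node under sysfs. The sketch below assumes a Linux host with `/sys/devices/system/node/nodeN/cpulist` files; `parse_cpulist` and `numa_nodes` are names we made up for illustration.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty string means the node has no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(sysfs_node_root="/sys/devices/system/node"):
    """Return {node number: [logical CPU numbers]} for each NUMA node found."""
    nodes = {}
    for node_dir in Path(sysfs_node_root).glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            nodes[int(node_dir.name[4:])] = parse_cpulist(cpulist.read_text())
    return nodes


if __name__ == "__main__":
    for node, cpus in sorted(numa_nodes().items()):
        print(f"NUMA node{node}: CPUs {cpus}")
```

On a single-socket machine this typically reports one node that contains every logical CPU.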

lscpu - See your CPU architecture


root@Ubuntu-2204-jammy-amd64-base ~ # lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
    CPU family:          6
    Model:               94
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    ...
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    1 MiB (4 instances)
  L3:                    8 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-7
  ...
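The numbers in this output are related: the CPU(s) count is the product of threads per core, cores per socket, and sockets. The sketch below checks that arithmetic against an excerpt of the output above. The `parse_lscpu` and `logical_cpus` helpers are our own illustrative names and only handle simple `key: value` lines.

```python
def parse_lscpu(text):
    """Turn 'key: value' lines from lscpu into a dict, skipping empty values."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def logical_cpus(fields):
    """Threads per core x cores per socket x sockets."""
    return (
        int(fields["Thread(s) per core"])
        * int(fields["Core(s) per socket"])
        * int(fields["Socket(s)"])
    )


# Excerpt copied from the lscpu output above.
SAMPLE = """\
CPU(s):                  8
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
"""

fields = parse_lscpu(SAMPLE)
print(logical_cpus(fields))  # 2 * 4 * 1 = 8
```

Here the product matches the `CPU(s): 8` line, which is why Linux reports eight CPUs for a four-core processor.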

What about terminology on the cloud?

AWS uses vCPUs to indicate VM processing power and defines them in line with the terms explained in this blog post. More specifically:

* On x86 architectures, AWS maps one thread to one vCPU and one core to two vCPUs.
* AWS's arm64 Graviton processors don't have multiple threads per core. So, AWS maps one core to one vCPU for its Graviton instances.
* Burstable instances are different. You get a slice of vCPU time and can burst above baseline usage as long as you have enough CPU credits.

Azure and Google Cloud also define vCPUs in the above manner. Other cloud providers may deviate from this usage. For example, some refer to burstables as basic or shared VMs.
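The vCPU rules above reduce to simple arithmetic. The sketch below is a hypothetical helper, not an AWS API: it assumes two threads per core on x86 and one on Graviton, as described in this post, and it ignores burstable instances.

```python
# Threads per physical core, per the mapping described above (an assumption
# that holds for typical non-burstable instances, not an official table).
THREADS_PER_CORE = {"x86_64": 2, "arm64": 1}


def vcpus_for_cores(physical_cores, arch):
    """Number of vCPUs exposed for a given number of physical cores."""
    return physical_cores * THREADS_PER_CORE[arch]


def cores_for_vcpus(vcpus, arch):
    """Physical cores needed to back a given vCPU count (rounded up)."""
    per_core = THREADS_PER_CORE[arch]
    return -(-vcpus // per_core)  # ceiling division


print(vcpus_for_cores(4, "x86_64"))  # 8
print(vcpus_for_cores(4, "arm64"))   # 4
print(cores_for_vcpus(8, "x86_64"))  # 4
```

This is also why an 8 vCPU x86 VM and an 8 vCPU Graviton VM are not backed by the same number of physical cores.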

Summary

CPU architectures have come a long way in the past twenty years. We now have terms to distinguish between concepts such as threads, cores, dies, sockets, and nodes. Further, different projects use different terminology. Add the public cloud and newer CPU architectures such as ARM, and things become confusing fast.

If you need to quickly remember these terms in the future, hopefully, this blog post will help. If any of this sounds interesting to you, and you have questions or feedback about Ubicloud, please drop us a line at [email protected]
