Intel's Cache Aware Scheduling Nears Linux Kernel Integration
For over a year, Intel engineers have been developing a promising feature for the Linux kernel known as Cache Aware Scheduling. This technology aims to optimize task placement on multi-core processors by considering cache hierarchy, thereby improving performance and resource utilization. After extensive testing on both Intel and AMD CPUs, the patches are now approaching readiness for inclusion in the mainline kernel. This article delves into the mechanics, benefits, and journey of Cache Aware Scheduling toward becoming a standard part of Linux.
What is Cache Aware Scheduling?
Cache Aware Scheduling is a kernel enhancement that leverages knowledge of the CPU’s cache topology to make smarter decisions about which core runs which task. In modern multi-core processors, each core often has its own private L1 and L2 caches, while larger L3 caches are shared among groups of cores (e.g., in a cluster or CCX). When a process or thread migrates between cores, it may lose its cached data, leading to performance penalties known as cache misses.
By considering this cache structure, the scheduler can reduce unnecessary migrations and keep frequently accessed data close to the executing core. This is especially beneficial in NUMA (Non-Uniform Memory Access) systems found in many Intel and AMD server platforms, where memory access times vary depending on the physical distance between the core and the memory controller.
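The cache-sharing relationships described above are exposed by Linux through the standard sysfs cacheinfo interface, where each cache level reports which CPUs share it. The following sketch reads that interface (falling back to an empty result if it is unavailable); it is an illustration of how topology information can be discovered from userspace, not code from Intel's patch series.

```python
import glob
import os

def parse_cpulist(s):
    """Parse a kernel cpulist string such as '0-3,8' into a set of CPU ids."""
    cpus = set()
    for part in s.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def cache_topology(cpu=0):
    """Return {'L<level> <type>': set of sharing CPUs} for one CPU,
    read from /sys/devices/system/cpu/cpuN/cache; {} if unavailable."""
    topo = {}
    base = f"/sys/devices/system/cpu/cpu{cpu}/cache"
    for index in sorted(glob.glob(os.path.join(base, "index*"))):
        try:
            with open(os.path.join(index, "level")) as f:
                level = f.read().strip()
            with open(os.path.join(index, "type")) as f:
                ctype = f.read().strip()
            with open(os.path.join(index, "shared_cpu_list")) as f:
                shared = parse_cpulist(f.read())
        except OSError:
            continue
        topo[f"L{level} {ctype}"] = shared
    return topo

if __name__ == "__main__":
    for name, cpus in cache_topology().items():
        print(f"{name}: shared by CPUs {sorted(cpus)}")
```

On a typical desktop CPU this prints private L1/L2 caches shared only by SMT siblings, and an L3 shared by a whole cluster or socket, which is exactly the structure the scheduler can exploit.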
How It Works
The implementation involves extending the existing Completely Fair Scheduler (CFS) with additional topology information. Specifically, the scheduler now tracks the last core on which a task ran and attempts to return it to the same cache domain when possible. If that core is busy, it searches within the same L2 or L3 sharing group before considering cores outside. This approach balances load while preserving cache warmth.
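The search order described above can be sketched as a toy model: prefer the task's previous CPU, then widen the search through successively larger cache-sharing groups. The function and topology here are hypothetical illustrations of the heuristic, not the actual data structures used in the kernel patches.

```python
def pick_cpu(last_cpu, idle_cpus, l2_domain, l3_domain, all_cpus):
    """Toy model of cache-aware core selection:
    1. reuse the task's previous CPU if it is idle (warmest caches),
    2. otherwise pick an idle CPU sharing that CPU's L2,
    3. otherwise an idle CPU sharing its L3,
    4. otherwise any idle CPU; if nothing is idle, stay on the old CPU."""
    if last_cpu in idle_cpus:
        return last_cpu
    for group in (l2_domain, l3_domain, all_cpus):
        candidates = sorted(idle_cpus & group)
        if candidates:
            return candidates[0]
    return last_cpu  # nothing idle: queue behind the previous CPU

# Hypothetical 8-CPU machine: CPUs 0-1 share an L2, CPUs 0-3 share an L3.
L2, L3, ALL = {0, 1}, {0, 1, 2, 3}, set(range(8))
print(pick_cpu(0, {1, 4}, L2, L3, ALL))  # 1: the L2 sibling beats CPU 4
```

Note how CPU 1 wins over CPU 4 even though both are idle: migrating within the L2 domain preserves more cache warmth than crossing to the other half of the chip.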
Intel’s patches also introduce a new cache domain concept, allowing the scheduler to distinguish between different levels of cache sharing. For example, on an Intel processor with Hyper-Threading, two logical cores share the L1 cache; a cluster of cores may share an L2 cache; and an entire socket often shares the L3 cache. The scheduler can then prioritize scheduling tasks within the smallest shared cache domain first.
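The "smallest shared cache domain first" ordering can be illustrated by walking a task's nested domains from innermost to outermost. This is a simplified model under assumed topology (SMT pair sharing L1, a cluster sharing L2, a socket sharing L3), not the patch series' actual cache-domain implementation.

```python
def candidates_by_domain(cpu, domains):
    """Given the nested cache-sharing domains containing `cpu`, ordered
    smallest first (e.g. SMT siblings, then the L2 cluster, then the
    socket-wide L3 group), return candidate CPUs so that cores sharing
    the smallest cache are tried first."""
    seen = {cpu}
    order = []
    for domain in domains:  # assumed already sorted from innermost out
        for c in sorted(domain - seen):
            order.append(c)
            seen.add(c)
    return order

# Hypothetical topology: CPUs 0-1 are SMT siblings, 0-3 share an L2
# cluster, and the whole 8-CPU socket shares the L3.
smt, cluster, socket = {0, 1}, {0, 1, 2, 3}, set(range(8))
print(candidates_by_domain(0, [smt, cluster, socket]))
# The SMT sibling (CPU 1) is tried first, then the rest of the cluster,
# then the remainder of the socket.
```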
The Journey to Mainline Linux
Intel engineers, led by developers such as Tim Chen and Subhra Mazumdar, have been submitting patches to the Linux kernel mailing list since early 2020. The work went through several iterations based on community feedback. Early versions focused solely on Intel hardware, but later revisions generalized the approach to also benefit AMD processors.
Testing has been extensive. The patches have been applied to various kernel trees, including the linux-next integration branch, a staging area for features destined for the next mainline release. Intel Xeon and Core i9 systems showed up to 15% improvement in certain benchmarks, while AMD EPYC and Ryzen CPUs showed similarly notable gains, particularly in multi-threaded workloads such as database and web server applications.
The current patches are in their v10 iteration, having addressed concerns about overhead, scalability, and fairness. As of mid-2021, the code is being reviewed for inclusion in the sched/core branch, where the scheduler maintainers stage changes before Linus Torvalds pulls them into the mainline kernel.
Performance Benefits on Intel and AMD
Early benchmarks demonstrated that Cache Aware Scheduling can significantly reduce the number of cache misses. On Intel processors, the effect was most pronounced on systems with multiple LLCs (last-level caches, i.e. separate L3 domains), where migration costs are high. On AMD Ryzen and EPYC, the technology improved performance in workloads that exhibit strong data locality, such as fio (the Flexible I/O Tester) and the SPEC CPU suites.
One notable finding was that the scheduler did not introduce regressions in single-threaded or lightly loaded scenarios. The optimization kicks in automatically when the system is under load, making it a transparent improvement for most users.
Current Status and Future Outlook
As of the latest updates, the Cache Aware Scheduling patches have passed the majority of testing hurdles and are considered close to merging into the mainline Linux kernel. Likely candidates for inclusion are the 5.13 or 5.14 kernel series, depending on final approval from maintainers like Peter Zijlstra and Ingo Molnar.
User adoption will initially be limited to those who compile their own kernels or use distributions that backport the feature. However, once merged, major distributions like Ubuntu, Fedora, and Arch Linux are expected to enable it by default. Looking forward, further enhancements may include dynamic adjustment based on workload characterization and integration with power management (P-state) scaling.
For those interested in testing the patches before official release, Intel provides a Git repository with the latest series, along with detailed instructions. The open-source nature of Linux means that anyone can contribute feedback to help refine the feature before it becomes a permanent part of the kernel.
Conclusion
Cache Aware Scheduling represents a significant step forward in optimizing Linux for modern hardware. By making the scheduler cache-aware, Intel and the community are delivering tangible performance gains without requiring user intervention. The fact that AMD CPUs also benefit demonstrates the universality of the approach. With the patches nearing mainline integration, Linux users can look forward to a more efficient, smarter kernel scheduler in the near future.