TIL1 - There are three scheduling-tick modes (NO_HZ_FULL, NO_HZ_IDLE, HZ_PERIODIC), two of which reduce ticks.

NO_HZ_IDLE reduces scheduling ticks only when a CPU is idle (for energy efficiency). This is the default.

NO_HZ_FULL also reduces scheduling ticks when a CPU has only one runnable task (used for HPC workloads or real-time applications).

HZ_PERIODIC never reduces scheduling ticks.
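To see which CPUs are actually running in NO_HZ_FULL mode on a given machine, one option is to read the CPU list the kernel exposes in `/sys/devices/system/cpu/nohz_full`. The helper below is only a minimal sketch: the function names are made up for illustration, and the parser assumes the plain `1-3,5` list format (it does not handle the strided range syntax).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Return true if `cpu` appears in a kernel-style CPU list such as "1-3,5".
 * Parsing stops at the first character that is not part of a plain list,
 * so an empty list or "(null)" contains no CPUs.
 */
bool cpulist_contains(const char *list, int cpu)
{
    const char *p = list;

    assert(list != NULL);
    while (*p != '\0') {
        char *end;
        long lo = strtol(p, &end, 10);
        long hi = lo;

        if (end == p)               /* no number here: stop parsing */
            break;
        if (*end == '-') {          /* range "lo-hi" */
            const char *q = end + 1;

            hi = strtol(q, &end, 10);
            if (end == q)
                break;
        }
        if (cpu >= lo && cpu <= hi)
            return true;
        if (*end != ',')            /* end of list (newline or NUL) */
            break;
        p = end + 1;
    }
    return false;
}

/* Hypothetical helper: false if the sysfs file is missing or unreadable. */
bool cpu_is_nohz_full(int cpu)
{
    char buf[256];
    bool ret = false;
    FILE *f = fopen("/sys/devices/system/cpu/nohz_full", "r");

    if (!f)
        return false;
    if (fgets(buf, sizeof(buf), f))
        ret = cpulist_contains(buf, cpu);
    fclose(f);
    return ret;
}
```

For example, `cpu_is_nohz_full(2)` would report whether CPU 2 is in the list; on kernels without NO_HZ_FULL the file may not exist at all, and the helper then returns false.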

TIL2 - The mechanism for draining the Linux buddy allocator's PCP (per-CPU pages) lists was recently changed: first from inter-processor interrupts to a workqueue, and then to remote draining by the CPU that invokes the drain. I knew this existed, but took a closer look during this week's local kernel study session.

The latest change is quite useful because the draining CPU does not need to wait for other CPUs to drain their PCPs locally, and nohz_full CPUs do not need to stop what they were doing to drain their local PCPs.
Draining other CPUs' PCPs remotely was previously impossible because, on !PREEMPT_RT kernels, the only synchronization method was disabling interrupts, which protects a PCP only against its own CPU. That's why the patch series changed the locking from local_lock to a spinlock.

One might wonder whether the change causes performance regressions due to spinlock overhead, but the data shows only a minor reduction in page fault rate.
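To make the locking point concrete, here is a toy userspace model, not kernel code: each "CPU" has a page count protected by its own spinlock, so any thread can drain every list without asking the owners to do it. All names (`toy_pcp`, `toy_drain_all`, and so on) are hypothetical, the lists are reduced to counters, and the demo creates no threads; the locks are only there to show where they would sit.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_NR_CPUS 4               /* hypothetical toy machine size */

/* Toy stand-in for a per-CPU page list: it only counts cached pages. */
struct toy_pcp {
    atomic_flag lock;               /* models the per-CPU spinlock */
    int count;                      /* pages cached on this CPU */
};

static struct toy_pcp toy_pcps[TOY_NR_CPUS];
static atomic_flag toy_zone_lock;   /* models the lock on the shared free lists */
static int toy_zone_free;           /* pages held by the shared free lists */

static void toy_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                           /* spin until the holder releases */
}

static void toy_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

void toy_init(void)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        atomic_flag_clear(&toy_pcps[cpu].lock);
        toy_pcps[cpu].count = 0;
    }
    atomic_flag_clear(&toy_zone_lock);
    toy_zone_free = 0;
}

/* Fast path: a CPU frees a page into its own list. */
void toy_free_page(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_lock(&toy_pcps[cpu].lock);
    toy_pcps[cpu].count++;
    toy_unlock(&toy_pcps[cpu].lock);
}

/*
 * Drain every CPU's list from the calling thread. Because a real lock
 * (not "interrupts off on the owning CPU") protects each list, the
 * caller can empty remote lists itself instead of waiting for each
 * owner to run a drain callback. Returns the number of pages moved.
 */
int toy_drain_all(void)
{
    int drained = 0;

    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        int n;

        toy_lock(&toy_pcps[cpu].lock);
        n = toy_pcps[cpu].count;
        toy_pcps[cpu].count = 0;
        toy_unlock(&toy_pcps[cpu].lock);

        toy_lock(&toy_zone_lock);
        toy_zone_free += n;
        toy_unlock(&toy_zone_lock);
        drained += n;
    }
    return drained;
}
```

The cost being discussed shows up in `toy_free_page()`: the fast path now takes a lock that another CPU could contend for, whereas a purely CPU-local scheme never had to.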

Interestingly, @vbabka seems to be introducing a similar mechanism for the new opt-in per-CPU array in the SLUB allocator. BTW, SLUB does not support draining other CPUs' per-CPU partial lists anyway. I wonder if he would try that as well? (ofc I guess not atm)
@hyeyoo given the concerns about spinlock overhead, I'm not sure future attempts at the percpu array will keep this locking scheme ...
@vbabka @hyeyoo is the slab PCP thing the same as the buddy PCP thing? Won't you just get pages from the buddy any way it likes? Or does slab invoke phys alloc with some GFP flag to avoid this?
@ljs @hyeyoo no it's an array of slab objects, not pages
@vbabka @hyeyoo OK so entirely separate from the PCP from the buddy allocator then? Confused because @hyeyoo mentioned buddy PCP as if that would have some impact here but I guess it's an aside?

I had always assumed it was just a separate per CPU list (pity names get reused in the kernel :)
@ljs @vbabka yeah they are not closely related - I just meant the similarity in the mechanism! :)
@hyeyoo @vbabka ah ok haha I was panicking that I had missed some important detail, as usual ;) cheers!
@ljs @vbabka haha yeah, but don't panic even if you miss some details - I miss them very often and that's normal