
[Hardware] Raspberry Pi 5 patch boosts performance up to 18% via NUMA emulation — Geekbench tests reveal gains in both single and multi-threaded performance


Shyloo

Igalia, the free software consultancy perhaps best known for its work on the Raspberry Pi's GPU, has revealed that it is investigating NUMA (Non-Uniform Memory Access) emulation for ARM64 devices. The investigation has so far yielded a potentially significant performance uplift for the Raspberry Pi 5, discussed on a Linux kernel mailing list in a message from Tvrtko Ursulin.

The patch details were posted to the mailing list, and the patch appears to be around 100 lines long. However, those 100 lines could have a big impact on the Raspberry Pi 5 and many other ARM64 devices.

According to the post, "This series adds a very simple NUMA emulation implementation and enables selecting it on arm64 platforms."
The change improves single-core performance by 6% and multi-core performance by approximately 18%. These figures were determined using Geekbench 6 test runs.


Ursulin explains in a little more depth: "[...] splitting the physical RAM into chunks and utilizing an allocation policy such as interleaving can enable the BCM2712 memory controller to better utilize parallelism in physical memory chip organization."
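
To make the idea of "splitting the physical RAM into chunks" and "interleaving" more concrete, here is a minimal toy sketch in Python. It is not the kernel patch and does not model the real memory controller; the function names `split_into_nodes` and `interleave_allocate`, and the page counts, are hypothetical and chosen only for illustration.

```python
def split_into_nodes(total_pages, num_nodes):
    """Split a contiguous range of page numbers into num_nodes chunks.

    Each chunk stands in for one emulated NUMA node (toy model only).
    """
    base, extra = divmod(total_pages, num_nodes)
    nodes = []
    start = 0
    for i in range(num_nodes):
        size = base + (1 if i < extra else 0)
        nodes.append(range(start, start + size))
        start += size
    return nodes


def interleave_allocate(nodes, count):
    """Hand out up to `count` pages, taking one page from each node in turn."""
    cursors = [iter(chunk) for chunk in nodes]
    exhausted = set()
    allocations = []
    node_id = 0
    while len(allocations) < count and len(exhausted) < len(nodes):
        page = next(cursors[node_id], None)
        if page is None:
            exhausted.add(node_id)  # this node has no free pages left
        else:
            allocations.append((node_id, page))
        node_id = (node_id + 1) % len(nodes)
    return allocations


if __name__ == "__main__":
    nodes = split_into_nodes(total_pages=8, num_nodes=4)
    for node_id, page in interleave_allocate(nodes, 5):
        print(f"node {node_id} -> page {page}")
```

In this toy model, consecutive allocations land in different chunks of the address range rather than filling one chunk first, which is the general property the quoted explanation relies on.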

What could this mean for the Raspberry Pi 5? Overall better performance from an already performant 2.4 GHz Arm CPU, which can be easily overclocked to 3 GHz or more.

 

The code is out for review, and with a little luck and hard work from the Linux Kernel developers, this patch could add even more performance to the Raspberry Pi 5 and many other ARM64 devices.

 

NUMA, mainly used in systems with multiple processors, is a computer memory design where memory access times depend on the location of the memory relative to the processor. In simple terms, NUMA allows each CPU to have its own bank of locally attached memory while still having access to the memory directly connected to other processors in the system. This results in lower latency for 'near' memory (locally attached) but slightly higher latency for 'far' memory (memory directly attached to other processors in the system).
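
The near/far trade-off can be expressed as a simple weighted average. The sketch below is only an illustration: the function name `average_latency_ns` and the latency figures in the usage example are hypothetical, not measurements from any real system.

```python
def average_latency_ns(local_fraction, near_ns, far_ns):
    """Average access latency when `local_fraction` of accesses hit near memory.

    The remaining accesses are assumed to go to far memory.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * near_ns + (1.0 - local_fraction) * far_ns


if __name__ == "__main__":
    # Hypothetical figures: 100 ns for near memory, 150 ns for far memory.
    for fraction in (1.0, 0.5, 0.0):
        print(fraction, average_latency_ns(fraction, 100, 150))
```

The more accesses a workload can keep on near memory, the closer its average latency gets to the near figure.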

 

The Linux kernel documentation covers NUMA in a little more depth as it applies to the Linux software stack: "Linux divides the system’s hardware resources into multiple software abstractions called “nodes.” Linux maps the nodes onto the physical cells of the hardware platform, abstracting away some of the details for some architectures. As with physical cells, software nodes may contain 0 or more CPUs, memory and/or IO buses. And, again, memory accesses to memory on “closer” nodes–nodes that map to closer cells–will generally experience faster access times and higher effective bandwidth than accesses to more remote cells."
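
On a Linux system built with NUMA support, the nodes the kernel currently exposes are typically listed in the sysfs file `/sys/devices/system/node/online`, in a compact form such as `0-3`. The following Python sketch reads that file if it is present; the helper names `parse_node_list` and `online_nodes` are hypothetical, and the code assumes the file uses the usual comma-and-range list format.

```python
def parse_node_list(text):
    """Parse a kernel-style list such as "0-3" or "0,2-3" into a list of ints."""
    nodes = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            nodes.extend(range(int(low), int(high) + 1))
        else:
            nodes.append(int(part))
    return nodes


def online_nodes(path="/sys/devices/system/node/online"):
    """Return the online node IDs, or an empty list if the file is unavailable."""
    try:
        with open(path) as f:
            return parse_node_list(f.read())
    except OSError:
        return []


if __name__ == "__main__":
    print("Online NUMA nodes:", online_nodes())
```

On a machine without NUMA support, or on a non-Linux system, `online_nodes()` simply returns an empty list.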
