Improving The Exynos 9810 Galaxy S9: Part 2 - Catching Up With The Snapdragon
by Andrei Frumusanu on April 20, 2018 9:00 AM EST
Following our review of the Galaxy S9 there’s been a lot of discussion about both the performance and battery life of the Exynos 9810 variants of the phone. In the original review I identified a few key issues with the platform that I deemed to contribute the most to the phone’s poor characteristics. In a first piece following the review I made a few minor changes to the kernel, which already seemed to benefit battery life in our web browsing test and slightly improved the phone’s performance characteristics.
In that previous article I noted that there’s a lot more to be done to further improve the performance of the phone and to optimise battery life. On the performance side of things in particular there was, in my opinion, very low-hanging fruit in terms of possible changes that would benefit the user experience.
Focusing on Performance
For this second part I set about recovering the best performance possible and matching the Snapdragon 845 variant of the Galaxy S9, while still keeping an eye on battery life.
Samsung Galaxy S9 (E9810) Kernel Comparison and Changelog
|Version||Changes and Notes|
|Official Firmware|As Shipped|Stock setup and behavior; Single Core M3 at 2704 MHz; Dual Core M3 at 2314 MHz; Quad Core M3 at 1794 MHz|
|||Optional Samsung-defined CPU Mode in Settings; CPU limited to 1469 MHz; Memory controller at half-speed; Conservative Scheduler|
|Custom Config 1||Start with 'As Shipped' Firmware; Remove hotplugging mechanism; Limit M3 frequency peak to 1794 MHz at any loading|
|Custom Config 2||Raise little core frequency to 1950 MHz; Raise big core minimum frequency to 962 MHz; Adapt EAS cost tables based on measured perf & power; Merge scheduler patches to 4.9-eas-dev (Up to Jan18); Backport PELT util_est and use it; Backport PELT decay rate change to 16ms; Adapt/disable no longer needed Samsung sched(util) mods; Minor custom modifications for tuning|
|Custom Config 3||Raise big core frequency to 2314 MHz & relevant adjustments|
As a starting point we’re continuing where we left off in part 1, which was extremely straightforward: the only changes were the removal of all boost frequencies above 1.8GHz on the M3 cores and the disabling of the online core / hotplugging driver.
In the original review the most evident performance issue I identified was how slowly the device scaled up in frequency and migrated threads onto the big cores. The figure I originally described was around 410ms for a steady-state continuous workload to actually reach the maximum frequency of the big cores. This was in stark contrast to the 65ms of the Snapdragon 845 variant. Setting all other things aside, this is what was limiting the interactive performance of the Exynos 9810 the most, so naturally it’s what we want to fix first and foremost.
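To make the ramp-up figure concrete, here is a minimal Python sketch of how such a number could be derived from a frequency trace. It is not the measurement method used for the article. The function name `ramp_up_time_ms` and the trace values are hypothetical; the trace is synthetic and only shaped to resemble the roughly 410ms case described above. On a real device the samples could come from polling the standard cpufreq sysfs file `/sys/devices/system/cpu/cpu<N>/cpufreq/scaling_cur_freq` while a continuous load is running.

```python
from typing import Iterable, Optional, Tuple


def ramp_up_time_ms(samples: Iterable[Tuple[int, int]],
                    target_khz: int) -> Optional[int]:
    """Return the time in ms between the first sample and the first
    sample whose frequency is at or above target_khz.

    samples is an iterable of (timestamp_ms, frequency_khz) pairs in
    chronological order. Returns None if the target is never reached.
    """
    start_ms = None
    for t_ms, freq_khz in samples:
        if start_ms is None:
            start_ms = t_ms  # the load is assumed to start at the first sample
        if freq_khz >= target_khz:
            return t_ms - start_ms
    return None


# Synthetic, illustrative trace (NOT measured data): a slow ramp on the big cores.
synthetic_trace = [
    (0, 962_000),
    (100, 1_469_000),
    (250, 1_794_000),
    (410, 2_314_000),
]

print(ramp_up_time_ms(synthetic_trace, 2_314_000))
```

The same helper applied to a trace from a faster-ramping device would simply return a smaller number, which is the comparison the 410ms versus 65ms figures express.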
Scheduling history around EAS
As a little backstory, ever since big.LITTLE’s introduction several years ago the biggest goal for Arm has been to have SoC vendors run the heterogeneous CPUs with a smart scheduler that is aware of the various CPUs’ performance and energy characteristics. This was a fine goal to have, but the road to get there has been, in my opinion, nothing short of a mess. Arm’s approach was to try to do the work in the upstream Linux kernel or within the Linaro workgroup kernel. Unfortunately, over the years and through repeated delays, a lot of the hype around what energy aware scheduling (EAS) would bring fizzled out when it came to shipping commercial devices. I think Qualcomm was on the ball here even as early as 2015 with the Snapdragon 810, and we’ve covered extensively what the company was trying to do to resolve issues relating to EAS.
A key component in enabling scheduling across heterogeneous CPUs is the ability for the scheduler to actually know the activity and load of individual tasks, instead of only knowing the general CPU utilisation. If you know an individual task’s load, then you can make better scheduling decisions about which CPU cores to place it on. This was originally implemented in the Linux kernel through the PELT mechanism (per-entity load tracking), and it is what was used for migration decisions in both HMP and EAS scheduling.
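As a rough illustration of why the decay rate of PELT matters for responsiveness, the following Python sketch models a geometrically decaying utilisation signal for a task that starts running continuously. It is a simplification under stated assumptions, not the kernel implementation: it steps in whole milliseconds rather than PELT's 1024µs periods, normalises utilisation to the range 0 to 1, and the function names and the 0.8 threshold are hypothetical. The 32ms half-life is the mainline PELT default; 16ms corresponds to the decay rate change listed in the changelog table.

```python
def pelt_ramp(half_life_ms: int, duration_ms: int) -> list:
    """Utilisation after each 1 ms step for a task that runs continuously
    from an idle start, using a geometric decay with the given half-life."""
    y = 0.5 ** (1.0 / half_life_ms)  # per-step decay factor, y**half_life == 0.5
    util = 0.0
    history = []
    for _ in range(duration_ms):
        # The old signal decays by y; the fully busy step adds (1 - y).
        util = util * y + (1.0 - y)
        history.append(util)
    return history


def time_to_reach(half_life_ms: int, threshold: float, limit_ms: int = 1000):
    """First millisecond at which the utilisation signal reaches threshold."""
    for ms, util in enumerate(pelt_ramp(half_life_ms, limit_ms), start=1):
        if util >= threshold:
            return ms
    return None


for half_life in (32, 16):
    print(half_life, "ms half-life ->", time_to_reach(half_life, 0.8), "ms to 80%")
```

In this model halving the half-life halves the time the signal needs to cross any given threshold, which is why a faster decay rate makes both task migration and frequency selection react sooner to a new heavy task.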
Exynos 9810 Floor Plan. Image Credit TechInsights
Another long-running goal of Arm and the Linux community was to integrate CPU frequency selection logic within the scheduler, instead of it being a separate mechanism. This was first attempted in a project called schedfreq, and is now fully integrated into a new governor called schedutil. Again, the implementation time-scale we’re talking about here was several years, during which several device generations shipped with a myriad of different solutions.
S.LSI’s Exynos chipsets were playing it safe, and up to the Exynos 9810 the company simply chose to stick to an HMP scheduler with a separate interactive CPU frequency governor. Huawei’s Kirin chipsets ship with EAS; however, even on the latest devices such as the P20, the company forgoes the scheduler-driven CPU frequency governors and falls back to a traditional interactive one (with very good results). Meanwhile Qualcomm has advanced its custom implementation and taken another approach called WALT (window-assisted load tracking), which is far more responsive than PELT. On the Snapdragon 835 and 845 this is the core mechanism that ensures the best performance in terms of scheduling and CPU frequency selection.
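To show what scheduler-driven frequency selection means in practice, here is a small Python sketch of the frequency choice made by mainline schedutil, which targets `1.25 * max_freq * util / max_capacity` and then picks the lowest available frequency at or above that value. The function name and the frequency table are assumptions for illustration: the table reuses big-core frequencies mentioned in the changelog and is not the device's actual operating-point table, and vendor kernels may modify this logic.

```python
def schedutil_next_freq(util: int, max_capacity: int, freq_table_khz: list) -> int:
    """Pick a CPU frequency from utilisation, following the mainline
    schedutil rule: next_freq = 1.25 * max_freq * util / max_capacity,
    rounded up to the nearest available frequency."""
    max_freq = max(freq_table_khz)
    raw_khz = 1.25 * max_freq * util / max_capacity  # 25% headroom over util
    for freq in sorted(freq_table_khz):
        if freq >= raw_khz:
            return freq
    return max_freq  # the request exceeds the table, so clamp to the top


# Illustrative table only (assumption), in kHz.
big_core_table = [962_000, 1_469_000, 1_794_000, 2_314_000]

print(schedutil_next_freq(512, 1024, big_core_table))   # half utilised
print(schedutil_next_freq(1024, 1024, big_core_table))  # fully utilised
```

Because the frequency request is a direct function of the utilisation signal, the speed at which that signal rises also bounds how quickly the CPU can reach its top frequency.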
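The difference in responsiveness between window-based and decay-based tracking can be sketched as follows. This is a simplified model in the spirit of WALT, not Qualcomm's implementation: the class name `WindowTracker`, the 20ms window length, the five-window history and the policy of taking the maximum of the most recent window and the history average are all assumptions chosen for illustration.

```python
from collections import deque


class WindowTracker:
    """Toy window-based load tracker: records the busy fraction of each
    fixed-length window and reports demand as the larger of the most
    recent window and the average over the kept history."""

    def __init__(self, window_ms: int = 20, history: int = 5):
        self.window_ms = window_ms
        self.windows = deque(maxlen=history)  # oldest windows drop off

    def close_window(self, busy_ms: float) -> None:
        """Record how long the task ran during the window that just ended."""
        busy = min(busy_ms, self.window_ms)
        self.windows.append(busy / self.window_ms)

    def demand(self) -> float:
        """Current demand estimate in the range 0.0 to 1.0."""
        if not self.windows:
            return 0.0
        recent = self.windows[-1]
        average = sum(self.windows) / len(self.windows)
        return max(recent, average)


tracker = WindowTracker()
for _ in range(4):
    tracker.close_window(0)   # task was idle for four windows
tracker.close_window(20)      # then ran for one full window
print(tracker.demand())
```

Under these assumptions a task that becomes fully busy is seen at full demand after a single 20ms window, whereas a geometric signal with a 32ms half-life would only have risen to roughly 0.35 over the same 20ms.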
jjj - Friday, April 20, 2018
Yet to finish reading but for clarity. the memory controller is at half speed only in the Samsung power saving mode and not with your custom configs?
PC Mark clearly does not depend on core perf much and maybe that's what's confusing. It's seen as mostly a CPU benchmark with GPU in photo editing.
And you are kidding about a robotic arm but you only need a moving fingertip with a sensor for the most basic testing and that's easily doable in days. I know Sparkfun has a 40$ IR sensor for robotic fingers but you can go with other sensors too. There are dedicated robots but, to pivot a bit to a slightly different topic, you could test app load times with a high speed phone camera and we would be happy. Wish you guys did that, test load times better than the folks on Youtube - so the bar is very low right now.
Andrei Frumusanu - Friday, April 20, 2018
The memory controller isn't limited in the custom config.
jjj - Friday, April 20, 2018
Is it possible to disable 1-2 big cores and run config 3, maybe with more aggressive settings?
And why blame the M3 core for web perf and not everything else that might add latency?
Anyway, this core feels like it was aimed at 7nm and it has potential if they improve on it.
ZolaIII - Friday, April 20, 2018
I still say old logic is much more superior (HMP, interactive & core_ctl). Window-based load tracking isn't something new, it just wasn't used much in HMP. EAS is just a big miss. Also seems that 2GHz remains sane limit for sustainable leaking.
serendip - Friday, April 20, 2018
Makes me wonder if this Exynos chip would be better as a midranger like the SD650, with the A55s doing most of the work and the M3 cores running in short bursts. A cut-down M3 with a lower frequency limit could work better; at 2 GHz+ the M3 seems to guzzle power for not much of a performance increase.
What kind of magic sauce did Qualcomm use in the SD845 S9 to get 11 hours of battery life vs 7 on the Exynos S9? I'm getting 11 hours on a Mi Max with a much larger battery so any further tweaks would be welcome.
ZolaIII - Saturday, April 21, 2018
You are getting 11h SOT in real life usage on an old phone with worn out or at least started to get worn out battery. S9 US version ain't getting 11h of SOT, it's getting 7, 8 at best. Equnos is getting 4 to 6. Check out on XDA. The S710 will be cut down version of S845 & 2GHz is limit for 10 nm second Samsung FinFET, check the graph S845 is even more leaking than Equnos when it's crossed.
dave_the_nerd - Friday, April 20, 2018
What's with the splash image of Spock messing with a chainsaw?
Ryan Smith - Friday, April 20, 2018
Part 1 was Spock with a screwdriver. This round of testing was much more extreme, so we had to use a more powerful tool.
SirCanealot - Friday, April 20, 2018
I just noticed the tool! Oh my, I just cracked up! Thanks for the article and the laugh!! :D
Ps, Andrei, I love you. (ie, thanks)
Hifihedgehog - Friday, April 20, 2018
Quite the upgrade from stone knives and bear skins!