+++
date = 2024-11-05T23:39:57+01:00
title = "Adding Zero RPM controls to the RX 7000 series"

[taxonomies]
tags = ["contrib"]

[extra]
related = []
+++

Last year I upgraded to an AMD RX 7900XTX, mainly to play Alan Wake II. Just
like my previous card, the XTX has a "Zero RPM" feature: it turns off its fans
fully if the junction temperature, measured at the hottest part of the GPU, is
below a certain threshold. With the fans off, the GPU relies on its massive
heatsink for passive cooling. Even in a very well-ventilated case, however, this
means that the area around the GPU heats up considerably. For me the fans turn
off at around 55°C; the component closest to the GPU, an NVMe M.2 SSD, will
usually slowly heat up to around 48°C whilst idling.

Even under load the SSD never exceeds any temperature threshold, so
realistically it should be fine, but I'm simply not happy with the amount of
thermal energy sitting around in there if it could be expelled easily by turning
on the fans. Worse still, the logic for toggling the fans is not very well
thought-out, and in the worst case the fans are on for one minute only to be off
for the next one, ad nauseam.

With my previous GPU, turning off "Zero RPM" was pretty simple. Using the
[upp(1)](https://github.com/sibradzic/upp) tool you could toggle the feature in
the GPU's so-called PowerPlay tables. It was a simple job, then, to write a
systemd service to turn off "Zero RPM" on system boot.

Sadly this is no longer possible on 7000 series cards as there is no more direct
access to the PowerPlay tables. Instead a new framework using
[sysfs](https://www.kernel.org/doc/html/latest/filesystems/sysfs.html) for
managing PowerPlay features was introduced. [Fan
curve](https://gitlab.freedesktop.org/drm/amd/-/issues/2402) controls were added
after a while (and a lot of moaning by users), but there was no such knob for
the "Zero RPM" feature. A couple of months ago a [feature
request](https://gitlab.freedesktop.org/drm/amd/-/issues/3489) was opened for
it, but nothing much happened on AMD's side.

Initially hopeful for a reasonably quick resolution, I was getting more and more
annoyed after a while by the lack of this seemingly simple toggle, so I finally
caved and proceeded to have a look at it myself. The hardest part was getting
started with reading
[amdgpu](https://gitlab.freedesktop.org/agd5f/linux/-/tree/amd-staging-drm-next/drivers/gpu/drm/amd?ref_type=heads)
code. The code base is absolutely massive and I had no real idea where to start.
Since fan curve controls already existed I thought it best to find the commit
that introduced them. After a quick search I found [the relevant
commit](https://gitlab.freedesktop.org/agd5f/linux/-/commit/eedd5a343d22) and
had a better understanding of which parts of the code to change.

So, after a while of tweaking and twiddling I had a working prototype and I
could finally have my GPU run its fans at all times. I knew a lot of people were
also waiting for this feature, so I [sent a
patch](https://lists.freedesktop.org/archives/amd-gfx/2024-October/115857.html)
upstream. After some short feedback and [the addition of another
feature](https://lists.freedesktop.org/archives/amd-gfx/2024-October/116274.html)
the series was accepted, and is going to be part of the kernel sometime soon.
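
For illustration, here is a minimal sketch of how the toggle could be flipped
from a script once a kernel with the series is running. It assumes the knob is
exposed as `fan_zero_rpm_enable` in the `gpu_od/fan_ctrl` sysfs directory of
`card0` and uses the same write-a-value-then-commit pattern as the fan curve
controls. The path, the card index and the helper name `set_zero_rpm` are
assumptions on my part, so check the amdgpu documentation for your kernel before
relying on them.

```python
from pathlib import Path

# Assumed sysfs location; the card index and file name may differ on your system.
DEFAULT_KNOB = Path(
    "/sys/class/drm/card0/device/gpu_od/fan_ctrl/fan_zero_rpm_enable"
)


def set_zero_rpm(enabled, knob=DEFAULT_KNOB):
    """Write the desired setting, then commit it.

    Returns the list of strings written, in order.
    """
    writes = ["1" if enabled else "0", "c"]  # "c" is assumed to mean "commit"
    for value in writes:
        # Each value goes out as its own write, like `echo value > file`.
        knob.write_text(value + "\n")
    return writes


if __name__ == "__main__":
    # Needs root; turns "Zero RPM" off so the fans keep spinning.
    set_zero_rpm(False)
```

Run as root from a oneshot systemd service, something like this would take over
the job that upp(1) did on my previous card.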

With the fans now running at all times I can happily report that ambient
temperatures have dropped by more than 10°C and the SSD usually does not exceed
40°C when idling. Even better, I feel quite proud to have finally contributed
code to the kernel.