Lesson in graphics


I was recently upgrading my laptop to the latest FreeBSD Current. For those who don't know, that's the bleeding-edge development branch, roughly the Linux equivalent of running Arch. While it mostly runs OK, you need to know what you are doing and be aware that some commits can be problematic, as they haven't gone through as much testing as a stable or release branch.

In this case I updated to the latest commit but could not get the graphics card working. No matter how many times I reinstalled the drivers, either from packages or by compiling from ports, it didn't work. After many hours of this I discovered (or rather remembered) that I had pinned the driver version in /etc/make.conf because of a previous issue with the latest drivers. This file overrides the defaults used when building ports. Since with FreeBSD you currently have to compile your drm graphics drivers against the current kernel, the build can be affected by anything in this file. In this case I had the following:

NVIDIA_OVERRIDE_VERSION= 570.153.02

.if ${.CURDIR:M/usr/ports/x11/nvidia-driver} && defined (NVIDIA_OVERRIDE_VERSION)
  DISTVERSION= ${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM= YES
.endif

.if ${.CURDIR:M/usr/ports/x11/linux-nvidia-libs} && defined (NVIDIA_OVERRIDE_VERSION)
  DISTVERSION= ${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM= YES
.endif

.if ${.CURDIR:M/usr/ports/graphics/nvidia-drm-66-kmod} && defined (NVIDIA_OVERRIDE_VERSION)
  DISTVERSION= ${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM= YES
.endif

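The above code sets the version of nvidia-driver, linux-nvidia-libs and nvidia-drm-66-kmod to the version stated in NVIDIA_OVERRIDE_VERSION, which in this instance is 570.153.02. NO_CHECKSUM skips the distfile checksum check, which would otherwise fail because the port's recorded checksums belong to a different version. The problem was that this version didn't work with the new kernel I had just installed.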

The simple fix was to remove or comment out the code and recompile the three ports. The lesson learned: when you put in overrides, don't forget that they are there.
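As a rough sketch of the recompile step, the script below only prints the commands rather than running them, so they can be reviewed first. The port directories come from the snippet above; the graphics/ category for the drm port, the rebuild_cmds helper name and the choice of the deinstall, install and clean targets are my assumptions, not necessarily what was actually typed.

```shell
#!/bin/sh
# Ports touched by the make.conf override.
# Assumption: the drm port lives under graphics/ in the ports tree.
PORTS="x11/nvidia-driver x11/linux-nvidia-libs graphics/nvidia-drm-66-kmod"

# rebuild_cmds (hypothetical helper): print one rebuild command per port.
# Nothing is executed here; the output is meant to be read, then run by hand.
rebuild_cmds() {
    for p in $PORTS; do
        echo "make -C /usr/ports/$p deinstall install clean"
    done
}

rebuild_cmds
```

Once the override has been removed from /etc/make.conf, the printed commands can be run as root one at a time.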

Another override that caught me around the same time was one that had resolved a previous issue, when firmware drivers were causing problems, again for my GPU. That one disabled the additional firmware drivers. In this case the following:

hw.nvidia.registry.EnableGpuFirmware=0

was in my /boot/loader.conf.
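One way to keep overrides like these from being forgotten is to list every active line in the files where they tend to accumulate. The following is a minimal sketch, not an established tool: the list_overrides name is made up for illustration, and it simply assumes that anything that is not blank and not a comment counts as an override worth reviewing.

```shell
#!/bin/sh
# list_overrides FILE...: print every active (non-blank, non-comment) line
# of the given config files as file:line-number:content.
list_overrides() {
    for f in "$@"; do
        # Skip files that do not exist or are unreadable on this machine.
        [ -r "$f" ] || continue
        # -v inverts the match, so comment and blank lines are dropped.
        # /dev/null as a second file makes grep always print the file name.
        # grep exits 1 when nothing is left; that is not an error here.
        grep -n -v -E '^[[:space:]]*(#|$)' "$f" /dev/null || :
    done
}

list_overrides /etc/make.conf /boot/loader.conf
```

Running this after every upgrade, before debugging anything else, would have surfaced both the NVIDIA_OVERRIDE_VERSION block and the hw.nvidia.registry.EnableGpuFirmware line.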

This reminded me of a discussion in the FreeBSD community about tweaks to your system, made for whatever reason, outliving their usefulness and polluting your system. It is similar to cargo cult programming in software development: code we use without knowing how it works or whether it even does what we think it does, but which we put in anyway because we have in the past.

I suspect this is getting much worse with the use of LLMs to solve issues. In closing, you should always endeavour to understand the settings and code you put in your system, especially if you are a software or systems engineer or in a similar profession. You should also not forget your workarounds.