>>108766473
update from >>108724666
rebar script guy here. i had some random crashes; turns out amdgpu would put the r9700 into power saving mode, which would cause the vram to be dumped into ram, but there wasn't enough ram, so the OOM killer would kill systemd for some reason and the kernel would panic.
loading the driver with modprobe amdgpu runpm=0 fixed the issue; it basically disables runtime power management.
here is the updated script for anyone whose bios doesn't support ReBar.
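if you want to double check that the option actually took, here's a minimal sketch that reads the module parameter and each card's runtime PM state from sysfs. runpm_report and the SYSFS_ROOT override are made-up names for illustration (the override only exists so it can be tried on a fake tree), and the persistent modprobe.d filename in the comment is hypothetical too.

```shell
#!/bin/bash
# print the amdgpu runpm parameter and each device's runtime PM state.
# SYSFS_ROOT is a hypothetical override, normally this just reads /sys.
runpm_report() {
    local root="${SYSFS_ROOT:-/sys}" dev status
    local param="$root/module/amdgpu/parameters/runpm"
    if [ -r "$param" ]; then
        echo "runpm=$(cat "$param")"
    else
        echo "runpm=unknown (amdgpu not loaded?)"
    fi
    for dev in "$@"; do
        status="$root/bus/pci/devices/$dev/power/runtime_status"
        if [ -r "$status" ]; then
            # "active" means the card is not runtime-suspended
            echo "$dev $(cat "$status")"
        else
            echo "$dev missing"
        fi
    done
}

# to keep the option without passing it on every modprobe
# (amdgpu-runpm.conf is a hypothetical filename):
#   echo "options amdgpu runpm=0" | sudo tee /etc/modprobe.d/amdgpu-runpm.conf
```

usage would be something like runpm_report 0000:03:00.0 0000:06:00.0 after the driver is loaded. if amdgpu gets loaded from the initramfs on your distro, the modprobe.d file probably needs an initramfs rebuild to apply at boot.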
#!/bin/bash
# use with kernel options "pci=realloc=off" and "pci=nocrs"
echo "0000:03:00.0" | sudo tee /sys/bus/pci/drivers/amdgpu/unbind
echo "0000:06:00.0" | sudo tee /sys/bus/pci/drivers/amdgpu/unbind
sleep 1
# sudo modprobe -r amdgpu
sudo rmmod amdgpu
# value n sets the BAR to 2^n MB: 15 = 32GB, 16 = 64GB etc.
echo 15 | sudo tee /sys/bus/pci/devices/0000:03:00.0/resource0_resize
echo 15 | sudo tee /sys/bus/pci/devices/0000:06:00.0/resource0_resize
# echo "0000:03:00.0" | sudo tee /sys/bus/pci/drivers/amdgpu/bind
# echo "0000:06:00.0" | sudo tee /sys/bus/pci/drivers/amdgpu/bind
sleep 2
sudo modprobe amdgpu runpm=0
# sudo modprobe amdgpu ras_enable=0
sleep 2
# some performance tuning here
echo high | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
echo high | sudo tee /sys/class/drm/card1/device/power_dpm_force_performance_level
sleep 1
echo "rebar done, here's the BAR"
for i in "03:00.0" "06:00.0"; do lspci -s "$i" -vvv | grep -A3 "Region 0"; echo; done
of course, replace the pcie addresses with those of your cards. maybe i could automate it in the future using rocm-smi.
anyway, since disabling power management, things have been rolling smoothly without any crashes, it just works.
as this is still kind of a hack, you may need to restart the amdgpu driver / relaunch the script if you try to use vulkan after rocm, but i just use rocm for moe as it's much faster, so i don't switch back and forth between them anyway.
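a rough sketch of how the addresses could be picked up automatically. note this swaps in a plain sysfs scan instead of rocm-smi, since sysfs still lists the cards after they're unbound from the driver. find_amd_gpus and SYSFS_ROOT are made-up names, and it will also match an AMD iGPU if you have one, so filter the output if needed.

```shell
#!/bin/bash
# list PCI addresses of AMD display devices by scanning sysfs.
# vendor 0x1002 is AMD; class 0x03xxxx is the display controller class.
# SYSFS_ROOT is a hypothetical override for trying this on a fake tree.
find_amd_gpus() {
    local root="${SYSFS_ROOT:-/sys}" dev vendor class
    for dev in "$root"/bus/pci/devices/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/class" ] || continue
        vendor=$(cat "$dev/vendor")
        class=$(cat "$dev/class")
        case "$vendor:$class" in
            # the directory name is the full address, e.g. 0000:03:00.0
            0x1002:0x03*) basename "$dev" ;;
        esac
    done
}
```

the script's hardcoded lines could then loop over $(find_amd_gpus) for the unbind and resource0_resize steps.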
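on the value written to resource0_resize in the script: the kernel takes n as "2^n MB", and reading the same file gives a hex bitmask of the sizes the card supports. a small sketch of both, with made-up helper names (rebar_index_for_gb, rebar_index_supported), in case you need a size other than 32GB:

```shell
#!/bin/bash
# index to write into resourceN_resize for a BAR of the given size in GB.
# the kernel reads n as "2^n MB", so 32GB = 32768MB = 2^15 -> 15.
rebar_index_for_gb() {
    local mb=$(( $1 * 1024 )) n=0
    while [ $(( 1 << n )) -lt "$mb" ]; do
        n=$(( n + 1 ))
    done
    if [ $(( 1 << n )) -ne "$mb" ]; then
        echo "size must be a power of two" >&2
        return 1
    fi
    echo "$n"
}

# check index n against the supported-sizes bitmask read from
# resourceN_resize (hex; bit n set means 2^n MB is supported).
rebar_index_supported() {
    local mask=$(( 0x${1#0x} )) n=$2
    [ $(( (mask >> n) & 1 )) -eq 1 ]
}
```

for example, rebar_index_for_gb 32 prints 15, and rebar_index_supported "$(cat /sys/bus/pci/devices/0000:03:00.0/resource0_resize)" 15 tells you whether the card advertises that size before you try to write it.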