I swapped my Mikubox to all P100 16GB PCIe cards internally, leaving the external 3090s in place. Despite having to add a thermocouple and a PWM channel to my fan controller, and make a custom power cable for the P100s, everything works:
:~$ nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-e2f8cd06-2c7d-accc-728b-62eef1627809)
GPU 1: Tesla P100-PCIE-16GB (UUID: GPU-7da63f72-d5a2-dadb-247a-3880060c84b6)
GPU 2: Tesla P100-PCIE-16GB (UUID: GPU-40205c56-3989-a682-17b2-c2ea90f70e5e)
GPU 3: Tesla P100-PCIE-16GB (UUID: GPU-6537af5d-1095-8402-6c50-d8d9d5afa9b5)
GPU 4: NVIDIA GeForce RTX 3090 (UUID: GPU-34724105-36dd-23ca-3a77-083008f640ec)
Now, last I checked (a week ago), exllamav2 had a bug with flash_attention on GPUs older than Ampere, so that might still be a blocker.
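The Ampere cutoff comes down to compute capability: flash attention kernels generally require sm_80 or newer, the Pascal P100 reports 6.0, and the 3090 reports 8.6. A minimal sketch of a per-GPU gate (the capability table below uses NVIDIA's published values; the 8.0 minimum is the commonly documented flash-attn requirement, and `supports_flash_attn` is a hypothetical helper, not part of exllamav2):

```python
# Published compute capabilities for the cards in this box (Pascal vs. Ampere).
KNOWN_CC = {
    "Tesla P100-PCIE-16GB": (6, 0),    # Pascal
    "NVIDIA GeForce RTX 3090": (8, 6), # Ampere
}

# Assumption: flash attention kernels need compute capability >= 8.0 (Ampere).
FLASH_MIN_CC = (8, 0)

def supports_flash_attn(gpu_name: str) -> bool:
    """True if the named GPU meets the minimum compute capability for flash attention."""
    return KNOWN_CC[gpu_name] >= FLASH_MIN_CC

for name, cc in KNOWN_CC.items():
    print(f"{name}: cc {cc[0]}.{cc[1]}, flash_attn={'yes' if supports_flash_attn(name) else 'no'}")
```

So in a mixed box like this, a loader would need to fall back to the non-flash attention path on the three P100s while keeping flash attention on the two 3090s.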