I know this is random, but I found out how to get the abliterated version of Qwen working for prompt rewriter.
1. Open custom_nodes\ComfyUI-QwenVL\gguf_models.json
2. Paste this in between the existing entries, keeping the JSON valid (note the trailing comma):
"Qwen3-VL-8B-Instruct-abliterated-v2-GGUF": {
  "author": "prithivMLmods",
  "repo_name": "Qwen3-VL-8B-Instruct-abliterated-v2-GGUF",
  "repo_id": "prithivMLmods/Qwen3-VL-8B-Instruct-abliterated-v2-GGUF",
  "mmproj_file": "Qwen3-VL-8B-Instruct-abliterated-v2.mmproj-f16.gguf",
  "model_files": [
    "Qwen3-VL-8B-Instruct-abliterated-v2.Q8_0.gguf"
  ],
  "defaults": {
    "context_length": 6144,
    "image_max_tokens": 4096,
    "n_batch": 512,
    "gpu_layers": -1,
    "top_k": 0,
    "pool_size": 4194304
  }
},
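If you'd rather not hand-edit the file (one stray comma breaks the whole JSON), here's a stdlib-only Python sketch that merges the entry above into gguf_models.json. `merge_entry` is a hypothetical helper I wrote for this post, not part of ComfyUI-QwenVL:

```python
import json
from pathlib import Path

# Entry for the abliterated Qwen3-VL build (values copied from the post above).
NEW_ENTRY = {
    "Qwen3-VL-8B-Instruct-abliterated-v2-GGUF": {
        "author": "prithivMLmods",
        "repo_name": "Qwen3-VL-8B-Instruct-abliterated-v2-GGUF",
        "repo_id": "prithivMLmods/Qwen3-VL-8B-Instruct-abliterated-v2-GGUF",
        "mmproj_file": "Qwen3-VL-8B-Instruct-abliterated-v2.mmproj-f16.gguf",
        "model_files": ["Qwen3-VL-8B-Instruct-abliterated-v2.Q8_0.gguf"],
        "defaults": {
            "context_length": 6144,
            "image_max_tokens": 4096,
            "n_batch": 512,
            "gpu_layers": -1,
            "top_k": 0,
            "pool_size": 4194304,
        },
    }
}

def merge_entry(path: Path, entry: dict) -> None:
    """Load gguf_models.json, merge the new entry, and write it back."""
    models = json.loads(path.read_text(encoding="utf-8"))
    models.update(entry)  # overwrites the key if it already exists
    path.write_text(json.dumps(models, indent=2), encoding="utf-8")

# Adjust the path to wherever your ComfyUI install lives:
# merge_entry(Path(r"custom_nodes\ComfyUI-QwenVL\gguf_models.json"), NEW_ENTRY)
```

Because it round-trips through `json.loads`/`json.dumps`, the result is always valid JSON, which a hand paste doesn't guarantee.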
3. Download the llama-cpp-python wheel that matches BOTH your Python version and your CUDA toolkit. I use Python 3.12 and cu130, so I grab the cp312/cu130 build.
4. Install it:
Q:\ComfyUI\python_embeded\python.exe -m pip install --upgrade --force-reinstall "Q:\ComfyUI\update\llama_cpp_python-0.3.30+cu130.basic-cp312-cp312-win_amd64.whl"
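Grabbing a wheel for the wrong Python is the usual failure mode in step 3 (pip will refuse it with "not a supported wheel"). This small stdlib snippet, run with ComfyUI's embedded python.exe, prints the cp tag and an approximate platform tag to match against the wheel filename:

```python
import sys
import sysconfig

# cpXY tag the wheel filename must contain, e.g. "cp312" for Python 3.12.
cp_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"

# Rough platform tag; on 64-bit Windows this comes out as "win_amd64".
plat_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")

print(f"look for a wheel containing: {cp_tag}-{cp_tag}-{plat_tag}")
```

The CUDA part (cu130 etc.) isn't detectable from Python's stdlib; match it to the CUDA toolkit your ComfyUI build uses.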