/g/ - Technology






File: collage.jpg (3.41 MB, 2400x3163)
Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106497264

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://rentry.org/wan22ldgguide
https://github.com/Wan-Video
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2122326
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>model-00001-of-00010.safetensors
should I stitch everything?
>>
File: 00036-1448483460.png (749 KB, 1216x832)
>>
>>106503413
Don't just cat them together; each file has a unique header and metadata.
>>
>>106503425
What layer preset do you use? One anon claimed that 'blocks' is superior but slower
>>
comfy should be dragged out on the street and shot
>>
>>106503430
wow new comfy UI looks fire
>>
>>106503449
alright I will find the best way to do it then
>>
File: ComfyUI_00004_.png (617 KB, 512x1024)
>>106503402
New to this whole comfyui thing, here's my first image made with

CLIPLoader (GGUF) Qwen2.5-VL-7B-Instruct-Q3_K_M.gguf
Unet Loader (GGUF) qwen-image-Q4_k_m.gguf
Load VAE qwen_image_vae.safetensors
>>
>>106503497
Gj anon
>>
File: ComfyUI_temp_exdpx_00003_.png (3.3 MB, 1536x1392)
So I've been playing with those Chroma Radiance models (very underbaked, I guess; the then-latest one was okay, but the one released a day later was absolute shit). I noticed it decides to denoise some stuff it can't recover from. This happened with Euler, etc. I thought, huh, I wish there were some mechanism for it to correct its mistakes.
Anyway, I noticed that Euler ancestral worked much better. I tried it with current Chroma (v48) and it works even better there. Pic related: Euler ancestral and DPM++ 2S ancestral "converge" to the first image, while ~all of the non-ancestral samplers converge to the second image. Keep in mind that I'm using a lora based on a Belgian comic book, so the one on the left is more correct. Give Chroma with ancestral samplers a chance.
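The "correct its mistakes" intuition is exactly what ancestral samplers buy you. A minimal sketch of the Euler ancestral step, following the math of k-diffusion's get_ancestral_step (illustrative Python, not any UI's actual node API; assumes float sigmas and a denoiser(x, sigma) that returns the model's clean estimate):

import torch

def euler_ancestral_step(x, sigma, sigma_next, denoiser, eta=1.0):
    denoised = denoiser(x, sigma)  # model's estimate of the clean image
    # split the move to sigma_next into a deterministic part (sigma_down)
    # and a stochastic part (sigma_up)
    sigma_up = min(sigma_next, eta * (sigma_next**2 * (sigma**2 - sigma_next**2) / sigma**2) ** 0.5)
    sigma_down = (sigma_next**2 - sigma_up**2) ** 0.5
    d = (x - denoised) / sigma            # plain Euler derivative
    x = x + d * (sigma_down - sigma)      # deterministic Euler step
    return x + torch.randn_like(x) * sigma_up  # fresh noise re-injected each step

Non-ancestral Euler is the same minus the last line (sigma_up = 0), so any early mistake gets locked in; the re-injected noise is what lets later steps overwrite it.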
>>
I just noticed chroma loads faster than sd1.5, kek. Shame it's still slow to gen, even with the 8 step loras.
>>
File: ComfyUI_00293_.png (1.62 MB, 1280x1280)
>>106503456
Oh, I'm using diffusion-pipe for Chroma atm. If OneTrainer were in the browser (like Comfy), then I'd probably use it. That being said, I've found block-only training to work well in the past for other models like Sigma
>>
>>106503546
desu it looks all right for something "underbaked", is it slower to use the radiance one over the regular chroma model?
>>
https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/wan.md
bros, anistudio is going to have true multigpu before cumfart. what is this timeline?
>>
>>106503577
>I just noticed chroma loads faster than sd1.5, kek.
absolute cap. actually load both and take screenshots of time to image.
>>
>>106503546
why do these models have such a hard time with swords, it's not like there's a lack of training data, but even the best models struggle with drawing swords correctly
>>
>>106503626
i just love big pendulous bosoms so much it's unreal bros.
>>
Blessed thread of frenship
>>
>>106503623
I'm not doing all that, got gens running. Try it for yourself, I'm using the chroma workflow from https://github.com/ChenDarYen/ComfyUI-NAG
>>
>>106503626
lust provoking nun
>>
>>106503660
unless you're misusing the word "loads" somehow it's physically impossible, sd1.5 is a smaller model than chroma. the smaller one will load faster, simple as.
>>
>>106503468
import pathlib
import safetensors.torch

# merge sharded model-XXXXX-of-XXXXX.safetensors files into one checkpoint;
# shards hold disjoint tensors, so a dict union reassembles the full state dict
state_dict = {}
path = pathlib.Path(".")
files = sorted(path.glob("*.safetensors"))  # sorted for deterministic order
for file in files:
    state_dict.update(safetensors.torch.load_file(file))

safetensors.torch.save_file(state_dict, "merged.safetensors")
>>
>>106503577
>I just noticed chroma loads faster than sd1.5
Unless you are using q4 of Chroma and t5xxl, then not a chance

SD1.5 is 4gb, Chroma checkpoint is ~16gb + t5xxl which is 9gb

What probably happened is that the Chroma checkpoint was cached in ram
>>
>>106503714
I'm trying something else but thanks for sharing anon
>>
>>106503691
Try it for yourself or don't lol, anyway, back to genin'
>>
>>106503588
Oh no, the images show v48.
Pic related is from the Radiance version from the day before yesterday. The one I currently have on my computer literally cannot generate good pictures, just awful noisy crap.
Radiance is about 1.5x slower than Chroma/Flux on my AMD card. Use the ChromaRadianceOptions node and set nerf_tile_size to 0 or 4k. Doesn't affect VRAM at all on my machine.
>>
>>106503742
Forgot to mention the most important stuff: my v48 trained loras work on Radiance.
>>
>>106503546
prompt? i have a hard time getting disembodied heads to work
>>
>>106502771
>Just ohhh, mmm, hmm that kind of stuff. I think contextually the words in the script helped it figure out what I wanted.
That's actually crazy, I have to try this out. Thanks anon
>>
File: Qwan_00018_.jpg (528 KB, 1984x2976)
>>106503497
Hope you stick around and post some cool stuff, Anon.
>>106503546
There's so much stuff that works on certain prompts/gens and utterly breaks on others with Chroma, it's kinda ridiculous. Even after genning a thousand plots I'm still not 100% sure on what I would stick with.
>>106503626
Her shadow makes her look like a muppet, heh.
>>
>>106503764
>This is a color illustration of a Miku Hatsune, a girl with teal twintails. wearing a gray dress and a black cloak with a golden clasp. In her left hand she holds a bloodied sword, drenched in blood, dripping with blood. In her right hand is the decapitated head of Teto, a girl with two red tornado curls. She is lifting it by one of the curls. She is walking a stony path on a moor. It is overcast.
I'm having a real hard time getting the neck-gore part to work. These models haven't seen decapitated heads, I'm quite certain.
>>
File: G0Ki0IFXQAA44ZI.jpg (308 KB, 1080x1440)
Guys, a little help: what models can be used to achieve this style?
>>
File: G0JZhaSWUAAocLY.jpg (394 KB, 1080x1440)
>>106503803
Attaching a few more
>>
>>106503742
The 'fuzzy edges' of Radiance keep getting less prominent, but they're still there.

Has lodestone mentioned how many more epochs he plans to train, or will he just keep going until it's good or the money runs out?
>>
File: G0LQgTRXQAACfsC.jpg (327 KB, 1080x1440)
>>106503816
>>
>>106503751
Yeah, this is really great. Whether you use VAE or pixel space doesn't seem to have any effect on the layers a lora is trained on, so a standard Chroma VAE-trained lora should work exactly as well on Chroma pixel space, aka Radiance, at least in theory.
>>
>>106503737
ah ok gotcha so it's a classic case of
>i don't actually understand what's happening but i'm still right and you're wrong anyway kthxbye.
typical.
>>
>>106503822
nta but he seems to have unlimited money. the initial fundraiser only raised 25k but he already spent 6x that. he either has a secret sponsor or he's throwing his entire house into this
>>
>>106503803
>>106503816
>>106503830
ask the weebs in >>>/adt/
>>
>>106503742
oof that still looks rough, let's hope it'll get the result of v48 with more training
>>
>>106503803
>>106503816
probably any noobAI/Illustrious based mix with the right artist/series tags desu. doesn't look particularly obscure or unique, just that kinda late 90s cowboy bebop/eva aesthetic.
>>
>>106503822
With Radiance I've noticed it will not go above the resolution it's trained with. With other models it's like "yeah let's add some artefacts" but with Radiance it's like "here's your upscaled 512x512 illustration". This is my experience with non-photo stuff (I don't really do any photorealistic stuff, so idk).
>>
>>106503869
will do..

>>106503879
thanks!
>>
File: sanaslop.jpg (3.43 MB, 4096x4096)
>>106503883
isn't this true with any model (cascade, sana) that claims to support '4k gens'?
>We introduce Sana, a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution
it's all bullshit, no free lunch. there is no getting around the fact that it's being trained at tiny resolution.
>>
>>106503980
I didn't mean like a proper upscale, just an interpolated one. Like when I look at a Radiance gen, I think "oh, you just gave up". It can look fine, but it's low resolution.
>>
>>106504035
(cont) but my point is that it doesn't really INVENT shit.
>>
>>106503980
sana uses resolution binning to avoid this, so even the 1024 1.0 model can do 4k images without deformities, albeit with poor fidelity.
you'll only get body horror when you set use_resolution_binning=False desu
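For anyone who wants to see the binning flag outside ComfyUI, diffusers exposes it on the pipeline call. A hedged sketch (the model id here is an assumption; check the Sana model card for the exact repo):

import torch
from diffusers import SanaPipeline

# assumed checkpoint id for the 1024px Sana model
pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
    torch_dtype=torch.bfloat16,
).to("cuda")

# with binning on, the request is snapped to the nearest trained
# aspect-ratio bin, genned there, then resized to the target size,
# which is why 4k "works" but comes out soft
image = pipe(
    "a corgi wearing sunglasses on a beach",
    height=4096,
    width=4096,
    use_resolution_binning=True,  # set False to see the body horror
).images[0]
image.save("sana_4k.png")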
>>
>>106504035
I wouldn't jump to conclusions right now, it's really undertrained at the moment. Did he say on his discord how much time it's gonna take?
>>
>>106503803
>>106503816
>>106503822
Love the 80's/90's shading. Soon I won't have to put up with basic anime art shit.
>>
File: 2649516.jpg (114 KB, 500x501)
Can you make a lora from pics that are below 512px? For example for an accessory or a headgear?
>>
>https://docs.comfy.org/tutorials/flux/flux-1-uso
has anyone tried this yet? especially for style transfers/deslopping flux?
>>
>>106504211
>flux
we moved on to qwen and edit does a good enough job already
>>
File: QwanEdit_00001_.jpg (487 KB, 1584x2376)
>>106503816
Fixed it for this general.
I'm kidding, of course. I like the gens. Are they from some Anon or just random finds?
>>
>>106503770
I can see all the seams of your shitty upscaling job
>>
>>106504263
found it on X, nice conversion, via Qwen I'm assuming? Are you the Nazi girl qwen anon btw?
>>
File: 1746041295075611.png (1.37 MB, 1024x1024)
>>106504211
it's really bad, the details are destroyed (worse than on Chroma lol)
>>
>>106504228
I use Qwen edit and even with the style transfer LORA it slops styles up.
>>
>>106504285
that gen is focused on identity preservation rather than style transfer though, right?
>>
>>106504307
yeah
>>
>>106504075
I don't really watch his Discord, joined just to confirm I got the correct Radiance fork of Comfy.
My point wasn't a reproach, just an observation. But now that I've thought about it: might the 512 -> 1024 switch be even worse for Radiance than it was for normal Chroma? I heard Radiance converges 8x faster (1 week vs 2 months)...
>>
File: 1732127025827636.png (198 KB, 2107x457)
>>106504359
>I heard Radiance converges 8x faster (1 week vs 2 months)...
in the paper it says it's 8x faster at inference?
https://arxiv.org/pdf/2507.23268
>>
>>106504384
I thought it was based on some 1+ year old stuff, with toy models and this was the first proper model.
>>
>>106504384
it's 8x faster than other pixel methods, but still 1.5x slower than going for a VAE
>>
>>106504432
I think you are an LLM.
>>
>see radial attention got a recent update
>still nothing checked on the to do list

sigh...
>>
>>106504468
says the bot kek
>>
File: heaven_and_hell.jpg (169 KB, 974x1024)
will we ever be able to move away from "videos are 5 seconds max because of hallucination", or is this the max?
>>
File: 28020418.png (734 KB, 832x1216)
https://voca.ro/19mPhLXOiiml

I can't believe this is what finally got me to install ComfyUI.
>>
>>106504563
I think it'll get better, but that also depends on jewdia
>>
>>106504563
>we will escape last frame
>we will escape +3 second nodes
>we will escape loop n' burn workflows
>we will escape load video to extend
>we will escape 5 second hell

>>106504576
Cursed reply of uncertainty and malevolence
>>
>>106504563
with the current model, all we need is the ability to connect multiple 81-frame windows together, with more than one frame as the starting latent, to preserve motion
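The shape of that idea in code, as a hedged sketch (generate_window is a stand-in for whatever i2v/continuation call you use; it's assumed to take conditioning frames and return a window whose first frames match them):

import torch

def extend_video(generate_window, first_window, num_windows=4, overlap=8):
    # first_window: (frames, channels, height, width) from the initial gen
    windows = [first_window]
    for _ in range(num_windows - 1):
        seed = windows[-1][-overlap:]   # several frames of motion context, not just one
        new = generate_window(seed)     # assumed: 81 frames, first `overlap` match the seed
        windows.append(new[overlap:])   # drop the duplicated frames when joining
    return torch.cat(windows, dim=0)

Seeding with a block of frames instead of a single last frame is what carries velocity across the seam; with one frame the next window has no motion information at all.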
>>
>>106504695
Think https://github.com/bbaudio-2025/ComfyUI-SuperUltimateVaceTools does that, but then again, it's for vace. Seems to be the best stitching nodes/workflows I've found. As for being seamless, it's hit or miss
>>
>>106504570
what settings did you use?
>>
>>106504779
Pretty much the default of https://github.com/wildminder/ComfyUI-VibeVoice except using q4 and 20 steps.
>>
>>106504677
you forgot
>we will escape python
>>
Hello, sharing some sloppy /adt/ Neta Lumina experiments:


-Test between Janku and Neta Lumina, tags only: >>106502411
-Test between WAI v15, Neta Lumina, Qwen and Nano Banana, tags only: >>106501820

-Bloat it, and a little meltie: >>106502890
-HiResfix 0.4 denoise >>106504681

Miscellaneous gens using only prose (contradicting the official guide) (spam reason: didn't use the other '>'):
>106502492
>106502549
>106502523
>106502694
>106503582
>106503871

Also, and more important, a question:
Does anybody know why Neta Lumina is so slow compared to other "heavy" models?
Maybe because it's not an SDXL/Flux/Qwen and Comfy isn't using sage attention, xformers, cuda, etc.?
>>
File: 1751215018161775.png (3.54 MB, 1800x1200)
when will AI have this level of sovl?
>>
>>106501820
>booruprompting NL models
holy retard
>>
can't seem to get vibevoice to work in comfy..
VibeVoiceSingleSpeakerNode

Error generating speech: Model loading failed: VibeVoice embedded module import failed. Please ensure the vvembed folder exists and transformers>=4.51.3 is installed.

the vvembed folder is there from the git clone and transformers 4.55 is installed

Requirement already satisfied: transformers in ./comfyui/lib/python3.12/site-packages (4.55.2)
>>
>>106504882
Which models do you recommend? I suffer from chronic Vramlet
>>106504891
Official guides recommend booru prompting!
>>
File: load.png (12 KB, 324x189)
>>106503718
?
>>
Friendly thread of shilling failed models
>>
>>106504898
just use the gradio interface. every audio model in comfy sucks because the interface is all over the place. it really isn't suited to that kind of work
>>
File: u_00028_.png (590 KB, 480x832)
qwen is only good with a boreal lora, otherwise it's all sameface shit
>>
File: 1735482965721344.png (1.58 MB, 1104x1328)
>>106504906
Naa, you have to use them in conjunction; the model has no idea what you are trying to convey with just the tags.
Here is "create a image where depicting #jeanne_d'arc_alter_\(fate\) is squatting in front of pot of black roses holding a watering can and watering the roses, she is looking at the roses expressionless, the background is white,"

I am too lazy to look up her outfit tags but yeah you get the idea. I don't know what he is testing on the 2nd and 3rd ones
>slow
Heftier text encoder is my guess
>>
>>106505097
does it help for qwen image edit as well?
>>
qwen nunchaku lora yet?
>>
Anyone recommend a ComfyUI setup that lets me iterate through a list/text file to replace part of the prompt?

Example: "A dancing cat, illustrated by {artist}"
Where {artist} will be replaced by the next entry in a csv file or equivalent for each successive image.

Not useful: wildcard where it replaces the variable with a random entry from the list, rather than iterating sequentially.
>>
>>106505111 (me)
Also, you really need to use an artist mix, since all the artists in this model are underbaked as fuck
>>
>>106504756
yeah something like that but for wan2.2
>>
>>106505152
use the ComfyUI API
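Minimal sketch of that approach using only the standard library (assumes a local ComfyUI on the default port, a workflow exported via "Save (API Format)", and that node id "6" is the positive-prompt CLIPTextEncode in your graph; adjust both to match your export):

import json
import urllib.request

with open("workflow_api.json") as f:
    workflow = json.load(f)

with open("artists.txt") as f:
    artists = [line.strip() for line in f if line.strip()]

for artist in artists:
    # overwrite the prompt text, then queue one gen; entries run in order
    workflow["6"]["inputs"]["text"] = f"A dancing cat, illustrated by {artist}"
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps({"prompt": workflow}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

This sidesteps the wildcard randomness problem entirely, since you control the iteration order.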
>>
>>106504843
I'd bet an uncensored model 10 times the size of Qwen could handle that.
>>
File: Chroma_00003_.png (2.79 MB, 1456x992)
>>
I saw something like this
>>
Newbie question
>The most important difference between forge and reForge is that forge works with SDXL models and flux, while reForge is only for SDXL-based models. reForge itself has a few optimizations that aren't present in forge and slightly better extension compatibility, making it the best option for SDXL models between each UI.
I am still not sure how to choose if I want re or non. How do I know if I'll want to run or at least start with flux models?
>>
>>106505195
>>106505152
erm
>>
>>106505084
>gradio
thanks.. didn't know there was one but found one
>>
File: sloppyworkflow.jpg (284 KB, 1669x871)
>>106505111
>>106505154
Thanks,
>Heftier text encoder
I don't know what that is, but I'm using the one that Neta Lumina suggests, which is Google Gemma, plus the Flux VAE.
Also I'm using NetaYumeLumina
>>
>>106505214
I love how ComfyUI let me have a txt2img hiresfix in one view
>>
>>106505196
>How do I know if I'll want to run or at least start with flux models?
If you have even a sliver of creativity you'll soon run into the limits of sdxl.
>>
>>106505196
NeoForge, forget ReForge, forget Forge
>>
>>106505214
You didn't actually have to download the gemma model separately, it comes embedded if you downloaded the civit one. Anyway, it's basically what takes your words and turns them into the language the model speaks to make your image. Lumina uses a small llm (gemma 2B) and sdxl uses clip, which is multitudes smaller. Using an llm gives the advantage of understanding the context behind words, which gives you better prompt adherence.
>>
>>106505227
it's too busy and panning sucks when inferencing. like the UI has Parkinson's or something
>>
>>106505240
neo forge is still baking, give it time. i would absolutely NOT recommend using it right now as it's based on the wildly outdated forge instead of reforge.
>>
File: slopyycompactworkflow.jpg (263 KB, 1814x838)
>>106505269
I see, if I'm understanding correctly, I'm bloating my workflow and increasing my processing time by using two Gemma 2B models.
One is already attached, and the other I downloaded and put in the node. The reason I had to download it is that the imported workflow indicated I needed the Gemma 2B.
>>106505296
more compact!
>>
everytime i get bored of wan i forget how good it understands camera controls and i just nut harder
>>
>>106505319
NeoForge is already operational and supports all models except Qwen.
The only distinction from ReForge is the "++" samplers and exotic schedulers that 99% of the time you won't use. I'm using it at the moment; more stable, more speed than ReForge and Forge
>>
>>106505339
unless you use that 4step lora.. then it disregards any sort of camera control whatsoever
>>
>>106505319
ani will steal the thunder before that happens. what would be the point of using a gradio UI if there is a full on desktop app available?
>>
>>106505354
cool. now actually use it with chroma and watch as the memory management makes you sit there like a dumb cunt for minutes.

look, i'll give you that it works completely fine for SDXL but anything else is out of scope.
the controlnet it uses is completely out of date and missing several things.
>>
>>106505357
shut the fuck up, grown ups are talking. go back to adt
>>
File: Chroma_00012_.png (2.4 MB, 1141x1404)
>>
>>106505335
No you should be fine, in comfy you load the TE and Unet separately, so what you are doing is just using another copy of the TE instead of using the one in the model,. Processing wise it doesn't matter, not much you can do about the speed other than maybe get sage running (which increases the speed around 30-is%) but honestly model is just slow
>>
>>106505383
you are mad because I am right
>>
>>106505371
>the controlnet it uses is completely out of date and missing several things.
Please explain this
>>
>>106495787
>>106495819
>>106495834
I really want to know what tags/lora anon used to get this aesthetic out of noob
>>
File: moarsloppymoarkompact.jpg (390 KB, 1914x961)
>>106505398
>sage running (which increases the speed around 30-ish%)
How can I achieve that? I believe it will help me with SDXL, WAN, or Qwen as well, thank you.
>>
>>106505443
Also, is there an option to magnet the nodes together for more COMPACTNESS
>>
File: Chroma_00015_.jpg (525 KB, 1160x1496)
>>106505401
Beksinski test
>>
>>106505443
SDXL doesn't work with sage, and qwen and neta generate a black screen. But for wan it's huge.
>>
>>106505486
OneTrainer, 16gb Chroma preset. 0.0003 LR, adamw8bit + cosine. Batch 2, per step 3. Training resolution 512. Images tagged with JoyCaption beta one + some booru tags
>>
File: Qwan_00031_.jpg (663 KB, 1984x2976)
>>106504268
Damn, I changed something about my latent rescale code last evening and I might have fucked something up.
There shouldn't be any seams, because I'm not tiling at all.
It might also have been a Wan quirk, sometimes it really goes overboard with backgrounds. I dunno!
>>106504275
Yeah, hi. Have a dipsy.
>>
File: Chroma_00022_.jpg (575 KB, 1160x1496)
>>
File: het_beest.png (36 KB, 222x154)
What am I missing about this? I got WebUI working and seem to only be able to create deformed beasts. I just want to generate realistic human portraits.
>>
>>106505587
how about generating at normal resolution and then downsizing
>>
File: ComfyUI_temp_lczat_00001_.png (2.47 MB, 1536x1152)
Does lora rank matter for training character/style or is it only good for the distill experiments and other memes?
>>
>>106505597
This is a cropped 512x512 image in any case. What checkpoint/sampling method should I be using?
>>
>>106505618
What model are you using?
>>
>>106505623
JuggernautXL
>>
File: 621037~01.jpg (23 KB, 303x362)
>>106505668
what the fuck is even that?
>>
>>106505668
use qwen
gen 1024x1024
downscale to 256x256
>>
File: Qwan_00032_.jpg (630 KB, 2496x2496)
>>106505587
That's my passport picture you freak, stop doxing me.
>>
>>106505192
Very nice, there's great texture when using Chroma for art
>>
What does it mean when it just generates a matte block of color?
>>
File: Chroma_00032_.jpg (629 KB, 1116x1488)
>>106505710
I let it run for 100 epochs, a little bit under 3 hours. I think I'd better up the training resolution

>>106505819
It's good, but I think the Image Contrast Adaptive Sharpening node from essentials is pretty much mandatory with some styles. It's just too blurry without it.
>>
>>106505689
NTA but should I not be genning above 1MPixel with Qwen? Their own examples show super high resolutions being used and I get much better results when I push it to 1536x1536 (can't go any higher than that because VRAMlet)
>>
>>106505192
Is this darkseed
>>
>>106505865
What model? For qwen and neta you need to remove the sage attention from the bat file
>>
>>106505891
sure
>>
>>106505476
>SDXL doesn't work with sage
what about pony, illustrious and noob?
>>
>>106505898
For qwen
>remove the sage attention from the bat file
I don't know what this means
>>
>TFW I don’t have voice samples of my oneitis to run in vibevoice


WHY MEEEEEEEEEE
>>
File: run.png (38 KB, 1177x252)
>>106505915
The launch file nigga. Also if you are using qwen regularly make a permanent copy with just the one parameter removed. I think comfy has been working on hotswapping sage, but idk if it's done yet.
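For anyone lost: it's just a launch flag in the .bat. A sketch of what the two copies might look like (--use-sage-attention is ComfyUI's flag as far as I know; paths and file names are placeholders):

REM run_wan.bat - sage on
python main.py --use-sage-attention

REM run_qwen.bat - sage off
python main.py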
>>
should i be using the default lanczos upscaler for Wan or is there one better you guys recommend?
>>
>>106505414
someone made an issue. what part do you not understand? it is missing crucial parts of ip-adapter, which makes it useless. you can't select whether you want style or composition, so instead you get nothing when using it.
>>
Not enough chroma hate in this thread so I'll catch us up.

1. Chroma failed. The issues that were apparent early on were never fixed.
2. Lodestones is incapable of seeing a project through to completion and is incapable of documenting his process. His delusional sycophants see this as a positive.
3. Lodestones is now using donated funds to train wacky concepts with no hope of viable output, like a Qwen-Edit-style edit model for Chroma and a VAE-less Chroma. He may as well be burning cash on livestream.
>>
>>106505990
keep crying while i jork to chroma outputs.
chroma is retarded in many ways yet it's still better than flux.
>>
>>106505990
AntiChromaSchizo was right all along.
>>
>>106506009
Not that guy but I train 30 steps for 100-150 epochs. But still testing since there is no actual proper info for this.
>>
>>106505990
>Same copypasted bait from the inception of Chroma

A better bait was
>Qwen is better
But Qwen is still leagues behind Chroma and the only evidence of it being "better" was slopped images. You still don't get the prompting freedom, variety or photorealism out of a pure txt2img prompt on Qwen, and you certainly can't just train a LoRA either as that LoRA would still have the same issues.
>>
Interesting video on woct0rdho's radial attention on 12gb card: https://www.youtube.com/watch?v=tNXdSnP-Tdc

and it looks like he's been updating it https://github.com/woct0rdho/ComfyUI-RadialAttn, going to give it another go later on
>>
>>106505990
4. Now Lodestones begs /ldg/ anons to use their GPUs to train his failed model like a total narcissistic furry.
If it's not a competitive model, at least it should be a cult model.
>>
>>106505990
>3. Lodestones is now using donated funds to train whacky concepts with no hope of viable output. Like a qwen edit style edit chroma model and a vae less chroma.

These are actually the best things he's been working on. chroma itself is hopeless because of 512, but these experiments could produce useful knowledge for everyone.
>>
>>106506034
Is it still hardlocked into specific resolutions?
>>
>>106506044
just disable samples and free your mind
>>
>>106505948
why are you shilling comfyui so hard?
>>
>>106506009
>No repeats? Or does Chroma just necessitate many many steps
I've mostly trained just one concept so there's no need for repeats.
>>
>>106505914
anon those all use sdxl as a base
>>
>*anon announces his post is bait*
>*other anons immediately take said bait*

Why are you like this.
>>
File: Chroma_00039_.jpg (499 KB, 1160x1496)
>>
>>106506034
considering no one talks about this, i assume it was snake oil/another meme speed up
>>
>>106506077
You know a LoRA is overfitted when it performs exceptionally well on the training data but struggles to generalize to new, unseen prompts and images, often exhibiting artifacts, oversaturation, or a lack of flexibility. You can identify overfitting by using validation sets to test performance, observing learning curves that show divergence between training and validation performance over time, or by generating a grid of images across different epochs to spot deteriorating quality. Symptoms like a "burnt-out look", image distortion, or the inability to generate varied poses or styles outside the training set are strong indicators of a LoRA that has become too specialized.
>>
>>106506081
>Still that anon btw
>>
>>106506077
hmm, sounds like you just like the fear and anticipation.
>>
>>106506081
You may think it's bait, but those were my genuine thoughts
>>
File: Qwan_00035_.jpg (737 KB, 2760x2064)
>>106505891
If you have kjnodes installed, there's an 'empty latent custom presets' node.
In the custom_nodes folder of kjnodes there's a custom_dimensions.json.
Plop these in there:
    {
        "label": "QWEN 1:1",
        "value": "1328x1328"
    },
    {
        "label": "QWEN 16:9",
        "value": "1664x928"
    },
    {
        "label": "QWEN 9:16",
        "value": "928x1664"
    },
    {
        "label": "QWEN 4:3",
        "value": "1472x1104"
    },
    {
        "label": "QWEN 3:4",
        "value": "1104x1472"
    },
    {
        "label": "QWEN 3:2",
        "value": "1584x1056"
    },
    {
        "label": "QWEN 2:3",
        "value": "1056x1584"
    }
>>
File: Chroma_00041_.jpg (521 KB, 1160x1496)
>>
File: rad.jpg (71 KB, 964x623)
>>106506048
Watch the video, it explains it but yeah looks like its still locked. Here's the width, height and lengths that work apparently (picrel)

>>106506092
Kek the video just came out yesterday
>>
>>106506092
Radial iirc was for better longer videos.
>>
I just started messing with OneTrainer so forgive my dumb question. Do I need ALL the files from pic related in addition to the actual checkpoint or is there a specific file that I should get instead?

I ask because if I point OneTrainer to just the base model by itself, it does not work. But if I point it to the same file from within a folder that houses all those files, it works. I want to know what file/folder is allowing me to proceed so I don't have to download everything if I don't have to.
>>
>>106506163
sure but what is the actual speed up, and how bad is the quality loss?

show me
>>
say we escape the hell of 5-second videos, do you have enough vram to prompt them?
>>
what do you guys prompt all day? your waifu?
>>
>>106506231
If it were /adt/, I would say yes, but here we mostly try and test things in general. Sometimes they are blue-balls-like, and other times they are purely technical.
>>
>>106506183
Holy shit you are beyond retarded. Read the fucking github https://github.com/woct0rdho/ComfyUI-RadialAttn woct0rdho's version is NOT about speed. Also look at the video https://www.youtube.com/watch?v=tNXdSnP-Tdc

Wipe your own ass for once.
>>
>>106506277
>woct0rdho's version is NOT about speed
??
>>
>>106506248
Yeah, I think I spend more time tinkering with new tech and Comfy than actually prompting anything.
>>
>>106506166
I would use easy training scripts for sdxl
>>
>>106506277
i'm not watching you video cheng.

> woct0rdho's version is NOT about speed.

are you dumb as fuck?
>>
radial is not about speed retard
>>
File: Qwan_00037_.jpg (759 KB, 1984x2976)
>>106506231
Usually I just gen a lot of plots and plop them in here, or try new models to compare... and then plot them again.
But lately I've been very stuck with genning random 1/2/3girls after being stuck on Wan for a bit.
Even if Qwen is slopped, it's a very fun model, and I barely touched any LoRAs yet.
Image Edit is also fun, but I don't really know what to do with it anymore.
>>
>>106506325
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

This?
>>
>>106505865
>What does it mean when it just generates a matte block of color?
Take "Rothko" out of the prompt.
>>
>>106506357
Yeah
>>
Chroma is a pretty good model, I'm really enjoying using it. The problem is that it simply doesn't live up to SD1.5.
>>
>>106506308
>>106506337
>retards cant figure out the main focus is to make longer more coherent videos
>>
>>106506393
it does both retard, it speeds up the inference + makes longer videos more coherent
>>
I think you get the best results with Chroma when you gen with Chroma and then use Qwen to img2img refine the output. That way you get the hyperrealism that Qwen gives you.
>>
>>106506393
>still needs certain resolutions and lengths

>>106506426
unironically this. i love making crazy shit in chroma that qwen simply doesn't understand and then denoising it at a low strength for a better look.
>>
File: Chroma_00053_.jpg (711 KB, 1160x1496)
>>
>to use comfyui please make an account
>please enter your phone number and credit card to make an account
it's going to get to this and you know it
>>
Are these settings good for training in OneTrainer for Illustrious? I did some tests and I sometimes have to increase the strength of the LoRa to 1.2 or more for the likeness to appear.
>>
>>106506409
Please keep these coming, very good style and non-ai like
>>
>>106506449
Somebody would either fork ComfyUI or it would be replaced by another UI, at that point.
>>
>>106506449
that's fine anon, the git is still public so we can simply rip all that shit out.
now.. the git going read only is when people should start to worry.
>>
>>106506445
Are these base model Chroma or are you using a lora ?

That fine detail comes through very well I must say.
>>
>>106506449
I think it's more like blender desu
>>
File: file.png (1.1 MB, 1024x1024)
qwen image 80s on a 3060 @110w
>>
File: file.png (650 KB, 1024x1024)
>>106506512
damn..
>>
>>106506528
what were the settings for the failbake? and was that with onetrainer? i have stuff i want to train but i am hesitant.
>>
>>106506509
does blender make addons you have to pay for? I thought it was a non profit foundation not a greedy corp
>>
>>106506459
>>106506461
this will happen because comfyorg will go public eventually. it would be at the peak of its popularity
>>
>>106506588
Blender itself? I don't think so, but there are thousands of paid blender addons
>>
>>106504840
>why its so slow Neta Lumina compared to other "heavy" models?
It should theoretically be about as heavy as SDXL, but the model has had little if anything done to its code since being copy-pasted from the research repo, compared with other models in ComfyUI. Doing torch.compile on the model shows how much faster it could get; I get a 2 to 3x speedup. Better hope someone steps up to the plate to do that work, it ain't easy.
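The kind of thing meant by that, as a sketch; the attribute path depends on how the loader wraps the model, so treat it as an assumption:

import torch

# `model` = whatever object holds the Lumina transformer after loading;
# .diffusion_model as the inner module is an assumption about the wrapper
model.diffusion_model = torch.compile(
    model.diffusion_model,
    mode="max-autotune",  # longer one-time compile, faster steady-state kernels
    dynamic=False,        # fixed latent shapes compile to the fastest code
)
# the first gen after this pays the compile cost; later gens get the speedup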
>>
>>106506599
>bad things could happen in the future, i must theorize and doompost about it so i can get those delicious (you)s
>the end times are coming!! THE END TIMES ARE COMING!!!
kek
>>
>>106506599
>the most techworker-saturated goon hobby
>thinking you could lockdown anything from them
lol
>>
>>106506038
>but these experiments could produce useful knowledge for everyone.
or the opposite: because it's being trained on 512 it will look like shit regardless, leaving the actual effectiveness of these techniques unclear
>>
>>106506641
for chroma right? why are you training both the TE and unet? pretty sure the TE is turned off for a reason
>>
fine, i'll try radial attention.

i think no one talks about it because it has retarded requirements, the prompt and resolutions need to be super specific which sucks dick.
>>
>>106506798
skipped that bit somehow, my b.
>>
>>106506787
Aren't anons ITT just stitching the videos with last-first frame?
>>
>>106506817
not sure how that's relevant but possibly.
not sure how they aren't getting huge degradation in quality though. there was that one video where it was a pokemon character slowly undressing over 40ish seconds and it didn't have any vae bullshit going on
>>
>>106506627
all of those are third party. the problem with blender is the licence, which also holds back comfyui. both are GPL3, so saas faggots are the only ones who are able to commercialize the code
>>
Is there a way to get hires.fix to work in comfyui like it does in forge? All the workflows i've tried work differently and i'm not able to set the upscaler and its denoising strength.
>>
>>106506869
no
>>
>>106506869
Here you go, complete the rest of the nodes with the model of your choice
https://files.catbox.moe/xld1sp.json
>>
>>106506451
if you are the /anon/ from adt, an anon there shared a screenshot when he trained his yuri girl lora some threads ago
>>
File: 001.png (147 KB, 1574x778)
>>106506902
>complete the rest of the nodes with the model of your choice
thanks for the json, i have no idea what the fuck im doing with nodes yet so i'll try to frankenstein it by connecting the dots and see if that works
>>
>>106506925
>if you are the /anon/ from adt,
Nope, i've never even gone to adt but i'll take a look. Do you have the screenshot or thread number by any chance?
>>
File: Qwan_00004_.jpg (861 KB, 1984x2976)
I think I'm nearing the end of my 'Qwen' look phase, but it takes fashion prompts so incredibly well that it's kinda hard to stop.
>>
>>106506943
Middle portion is where the upscale happens, then you encode to a latent and send that to the ksampler, where you can set the denoise and steps and such like in forge.
I left the links open because they change depending on which model you are using; if XL, just add a checkpoint loader and a vae loader and connect as needed. If others, add whatever they need to load lol.
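For reference, the same two-pass idea written out in diffusers, since the node graph can be hard to read (model id, resolutions and strength are example values, and a plain resize stands in for a real upscale model):

import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "a dancing cat, watercolor"
low = base(prompt, width=832, height=1216).images[0]      # first pass at base res

hires = StableDiffusionXLImg2ImgPipeline.from_pipe(base)  # reuse the loaded weights
up = low.resize((1248, 1824))            # naive upscale; an ESRGAN model is typical
final = hires(prompt, image=up, strength=0.4).images[0]   # strength == hires denoise
final.save("hiresfix.png")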
>>
File: 1756586426561487.jpg (248 KB, 1892x1248)
>>106506925
>>106506965
i think i found it
>>
>>106506787
anon here

it seems okay? i'm really not seeing the benefits yet but i've only gen'd one video
>>
Comfyui has this really cool new feature where if you leave the tab open long enough it will start gobbling up sys ram and raping your cpu.
>>
>>106507000
Did you use 2.1 or 2.2? I can't get it to work with any of the all-in-one 2.2
>>
>>106507048
Yeah I always thought my pc was too fast, glad comfy is giving us the features the real ones want.
>>
File: ComfyUI_00155_.mp4 (607 KB, 480x832)
>>
>>106506077
What you're supposed to do is dump all of the lora epochs into Forge and do X/Y grid tests for various prompts, and figure out which ones fit the style best with the least amount of artifacts or burning. The training samples usually suck and give different types of outputs, from my experience.
>>
File: ComfyUI_00269_.mp4 (657 KB, 640x640)
>>
File: file.png (12 KB, 708x279)
>>106507049
because the all-in-one is shit.

also, i take it back, i lowered the resolution and just keep getting errors because the res and prompt aren't specifically how it wants them.
for fairness i'll try it again in a higher res, which i guess is where you'd use it, but.. so far there are zero speed increases. the light loras are doing all the work while radial just lies there like a bored whore.
>>
File: hiresfix.png (87 KB, 709x837)
>>106506902
Ahem... *clears throat* dear anons, I want to say that I LOVE FORGE, it is the best UI I've seen so far and I will NEVER, I repeat, NEVER touch a node in my life, did you hear me? NEVER!
>>106506943
>>106506992
Why do you want to use hiresfix in Comfy? It is absolutely inferior; there are people on reddit commenting on this.
Comfy has never figured out how lllyasviel managed to do hiresfix. He has just been coping, creating around 50 different hiresfix nodes and upscaler nodes, and upscalers of upscalers, all without any success.
>>
>>106507077
>the screenshot
>>
File: ComfyUI_00270_.mp4 (796 KB, 640x640)
>>
I LOVE BLOAT!!!!! I LOVE PYTHON BLOOOOAAAATTTT!!!! WEB DEVELOPMENT IS THE BEEEEST!!!!
>>
>>106506989
I'm slowly figuring it out, i'm VERY new to nodes so i'll just download some workflows and rip the basics from them. I should be able to figure it out, hopefully I don't brick my comfyui with too many nodes lol
>>
File: ComfyUI_temp_gkyms_00002_.png (1.76 MB, 1152x1152)
So organic...
>>
SNAKES SNAKES ON MY WALL SNAKES UP MY ASSSS
>>
File: ComfyUI_00271_.mp4 (3.54 MB, 640x640)
hey at least it isn't all programmed in Go
>>
>>106507075
Can't even get this to run, it just throws errors, and this is with the provided workflow kek
>>
why is girls smoking so fucking kino in art but trash irl?
>inb4sloppa
yeah i dont care i also like burger king.
>>
>>106507114
chroma version?
>>
>>106507129
HD Q8
>>
>>106506310
See this in a lot of hobbies. I’m also into 3d printing and there’s definitely a split between “working on the printer is the hobby” and “end products are the hobby I just want the printer to work and be unobtrusive”. Perhaps it’s human nature.
>>
>schizo won
grim
>>
>>106507122
default wf works, it just has stupid model naming so make sure to replace them with names that make sense.
but honestly it doesn't seem worth it. too autistic as it requires exact resolutions
>>
File: Qwan_00007_.jpg (758 KB, 1984x2976)
>>106507100
You sneaky fuck.
I appreciate you animating my slop.
>>
>>106507106
But enough about comfy
>>
>>106507123
it's just photogenic, looks good in movies which influenced the generation that changed its image from glamourous to trashy
plus it stinks
>>
File: eps.png (1.7 MB, 832x1216)
Anyway, Euler ancestral fixes like 50% of Khroma's hand issues.
>>
>>106507109
>hopefully I don't brick my comfyui with too many nodes lol
don't worry, updates do that just fine
>>
>>106507165
was he not talking about comfy? lol
>>
>>106507171
thank you for reminding me to try this
>>
>>106507172
i do not plan on updating unless absolutely necessary and i have a backup ready in case i brick it. i just don't want to brick it since i'm a noob
>>
>>106507179
I recommend using the Euler ancestral node and pumping eta up to 1.5.
>>
>>106507190
are you implying pros want to brick or something?
>>
>>106507191
with the clownsharksampler? i'll try. still currently fighting against radial attn but it's turning into a waste of time. could have generated so much shit in the time it took me so far.
>>
File: 8dzcdp.jpg (81 KB, 690x646)
Is anyone else having issues with ComfyUI? It seems to interfere with my prompts. I want to create prompts, but something always happens with Comfy that prevents me from doing so, wasting my free time fixing unexpected problems.
I feel like pic related
>>
>>106507190
If you just installed, it's probably too late; the current version is pretty fucked memory-wise. Anyway, you should be ok if you are just doing XL stuff at least.
>>
>>106507197
>something
no please, do be more vague.
my issues with comfy so far have always been in front of the monitor.
>>
>>106507196
No, I meant the vanilla Euler ancestral node.
>>
>>106507197
why else do you think there has been more outspoken hate for the griftui?
>>
>>106507169
it really is such a great aesthetic
>>
How many steps does WAN need without the 4steps lora?
>>
>>106507197
waiter, waiter, more downscaled jpegs
>>
>>106507197
Did you create your ComfyAccount? So we can telemetry your nodes¿
>>
>>106507216
tree fiddy
>>
File: -1309939556.gif (291 KB, 700x704)
>>106507219
>¿
of course...
>>
>>106507216
It defaults to 20 per sampler, with 25 giving better results generally. You get artifacting starting around 15 when I was experimenting weeks ago.
>>
>>106507216
20 if you go by the comfy workflow, officially it's 40 steps. for the 5B version it's 50.
>>
>>106507197

I have been here for years and have never seen a WebUI bug report, no one saying "I don't know what's wrong with Forge when I do x thing, ...", and this is a UI that has no maintenance.
>>
>>106507236
>>106507243
So 20 per sampler ... No wonder there is such a strong push to shill the lightning lora
>>
>>106507197
Why is it that in every thread (excluding the ComfyHaters) there is always someone asking for help with Comfy?
There is always some dependency issue, always some Sage Attention that isn't working as it should, always some black screen output.
What's going on?
This is excluding the ComfyHaters and focusing on the number of people asking questions about Comfy because things aren't working.
>>
>>106507193
of course not, but noobs like me are more prone to bricking something through lack of knowledge or retardation
>If you just installed it's probably too late current version is pretty fucked memory wise
pic related is my version, is that the one that's fucked?
>>
>>106507244
sometimes people would kick and scream about an oom issue but it was mostly people being retarded
>>
File: postcard.png (288 KB, 868x321)
>>
>>106507202
meant to quote you here
>>106507257
>>
>>106507257
Every so often over the last little while I've seen this thread say that 3.50 was the one that seemed to turbofuck the memory issues, but I don't use comfy so I dunno lel
>>
>>106507257
See? This is not a ComfyHater, this is another Comfy user with a problem
>>
>>106507262
never understood why you are such a massive faggot
>>
File: file.png (15 KB, 857x348)
yeah fuck radial, i'm over it. it refuses any resolutions it should be able to take (according to gemini and gpt)
>>
>>106507255
greed is the motivation. the project is under the control of an ex-Google chink scammer and ever since there have been a lot of problems. it's not that people want comfyui to be replaced, but that it must be replaced to stop the headaches, bloat and griftmaxxing
>>
>>106507212
definitely understand where smokeules anon was coming from
>>
>>106505990
chroma has worked for me since v2X, sorry you are retarded
>>
>>106507284
Don't you see? Another Comfy user, we are excluding haters (like me) and focusing on the real users with real problems
>>
>>106507262
he lives rent free in ldg brains to this very day
>>
>>106507255
because it's the main ui people use, and every software project under the sun has always been a buggy mess?
>>
>>106507274
>>106507296
nah, you are just a comfy hating schizo. there is nothing wrong with my SOTA node spaghetti
>>
>>106507266
Yeah, the problem started a few versions back, but that one still has it. You can try it yourself; there seem to be some people still unaffected by it.
>>
File: ComfyUI_01075_.png (1.52 MB, 1024x1536)
>>
>>106504843
Outpainting can probably do it. Simple prompts, who knows?
>>
These threads are still consistently the most brainless thing on all of /g/.

If it gets contained to one thread then I can call it a "containment thread", but otherwise I'm legit just one step away from thinking that the mods should ban it.

Move it to /s/ or /a/.
AI is technology and you use AI to generate images of fictional anime characters with big cans by typing "anime characters with big cans". Therefore it belongs on the technology board.
Trucks move food so let's have a thread about trucks on /ck/.

I'm convinced that about 1% of the people here even have a passing interest in how AI works and how this mystifying computer program is able to create a photorealistic image of Chun-Li eating a Big Mac with Peter Griffin. You click the button.
>>
>>106503803
Greatest anime art style in the history of anime.
>>
>>106504570
pretty good


