/g/ - Technology

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106978567

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://civitai.com/models/1790792?modelVersionId=2298660
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>sneething so much he only put his shitty gens in the OP
real bake when?
>>
>>106981016
kino collage
>>
Blessed thread of frenship
>>
File: 324234.mp4 (2.55 MB, 688x464)
>>
File: 1753533307382025.mp4 (3.55 MB, 816x656)
>>106980892
>>
>>106981043
it's a shame wan fucked up the lettering but nice otherwise
>>
File: 1747365534622833.mp4 (3.63 MB, 848x624)
https://www.youtube.com/watch?v=HKn8UWvEraE
>>
I dedicate this thread to total SaaS victory
>>
File: QwenEdit_00220_.png (1.44 MB, 1024x1024)
>>
File: kek.png (509 KB, 640x640)
>>106981145
Like Albert Camus said:
>"One must imagine SIMPhius happy"
>>
>>106981016
>>
>>106981145
lol
>>
File: 1755169091399189.png (13 KB, 327x340)
is there some way to generate additional information on top of an image, similar to the way a detailer works? I'm not talking about masking with denoise to redraw parts of the image; I mean drawing on top of a layer without changing what's underneath, while still using the image underneath as context. A completely random example: a face where I want to put white liquid on top of it. Masking with a detailer requires a high denoise, which redraws the face underneath. This specific problem probably has a specific solution, like combining a face lora with a white liquid lora, but I'm looking for something broader that can cover a ton of different cases. Maybe stuff like Flex does this? But I'd prefer not to put an additional unet into my workflow.
>>
>>106981309
The closest thing is edit models, for which you would need to train a lora. The second option is roughly painting it in yourself and doing multiple inpaint/img2img passes until it looks decent, then layering the result on top of the old image and selectively erasing to restore the face details.
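The final "layer it back and restore" step can be sketched in a few lines of Pillow. This is only an illustration of the compositing idea, not any node's actual code; the image names and the mask are hypothetical, and the inpaint passes themselves happen elsewhere:

```python
# Sketch of compositing an inpainted result over the ORIGINAL image through
# a mask, so untouched regions (e.g. the face) keep their original pixels.
# Filenames/mask below are placeholders, not from any real workflow.
from PIL import Image

def restore_through_mask(original, edited, mask):
    """Keep `edited` pixels where `mask` is white, `original` pixels where black.

    original, edited: RGB images of the same size.
    mask: grayscale ("L") image; soft gray edges blend the two.
    """
    return Image.composite(edited, original, mask)

# Usage (paths are hypothetical):
# original = Image.open("original.png").convert("RGB")
# edited   = Image.open("inpainted.png").convert("RGB")
# mask     = Image.open("liquid_mask.png").convert("L")
# restore_through_mask(original, edited, mask).save("final.png")
```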
>>
File: 1744048327017382.jpg (1 MB, 1416x2128)
>>
>>106981309
You could maybe work something with the Krita extension since it uses layers.
>>
>>106981336
nta but I could never get the piece of shit to work properly, even following the creator's videos exactly. It seemed very limited compared with just copy-pasting the image into GIMP, making the edit, then pasting it back into Comfy, without having to fuck around with some tool. Maybe I just installed it when it was bugged or something.
>>
>>106981509
majority of local copium (ipadapter, regional prompter, rescale cfg) is complete snakeoil trash.
>>
>>106981509
>>106981336
I mean it wasn't showing the image in ComfyUI and vice versa; its nodes weren't doing what they should according to the demo video. I followed the instructions exactly and used the provided workflow, got fed up with it, and just gave up. It would be really fucking cool, though, to have something that just works as an interface between Comfy and something like GIMP.
>>
File: 1735764525252164.jpg (1.05 MB, 1416x2128)
>>
File: image_00144_.jpg (554 KB, 1240x1672)
>>106981515
>rescale cfg
I've had great results with this. It's just not something you want always on
>>
>>106981515
PLASMA LATENTS DUDE! it was proven that rescale cfg, cfg++ and other garbage are worse than normal cfg (there was a paper about this), but since they produce DIFFERENT results they think they hit it big, the AUTEURS that we have here LMAO, fucking copers
>>
>>106981617
>there was a paper about this
link?
>>
>>106981617
>>106981638
Yeah bro don't trust your eyes, trust the paper
>>
File: image_00152_.jpg (594 KB, 1240x1672)
>>
2girls when?
>>
Generated image attached to snarky comment when?
>>
Generated image next to offhand reply to snarky comment when?
>>
File: image_00114_.jpg (652 KB, 1240x1672)
>>
File: image_00128_.jpg (543 KB, 1240x1672)
So few people have this hobby, why even care. I don't get it.
>>
https://www.reddit.com/r/aivideo/comments/1odt02a/wonderland_wow/

When will local be on this level?
>>
>>106981756
Sad and desperate
>>
File: image_00100_.jpg (435 KB, 1240x1672)
>>
>>106981572
that's really cool
>>
>>106981756
they just stay in the same position, it's not natural at all
>>
Any local alternatives to ElevenLabs?
>>
>>106981867
You can rent indians locally on fiverr
>>
>>106981867
Vibevoice, Chatterbox
>>
>>106981870
keek
>>
What are everyone's favorite checkpoints these days?
>>
>>106981922
Seedream 4
>>
>>106981922
NoobAI v-pred and Chroma variants
>>
File: It is me, Sam Altman.png (781 KB, 761x760)
>>106981922
Sora 2
>>
>>106981922
Chroma is the only answer.
>>
how much faster is 5090? thinking about upgrading from 5070ti
>>
>>106981984
you'll be happier about the fact that it's a 32GB card; with that amount of space you won't have to cope anymore with wan and shit
>>
what version do anons use for chroma now? v50?
>>
>>106981998
retard
>>
>>106981724
>>106981778
Nice style, what model and lora is it?
>>
>>106981922
Checkpoints have different uses for me.
When I make an image, I can use up to 4-5 different ones.
Upscaling and inpainting is where it's at.
>>
>>106981998
HD version and various mixes

>>106982013
Chroma-DC-2K-T2-SL4 + Digital brushes lora I trained
>>
>>106982012
elaborate
inb4 no reply
>>
>>106982035
no need. if you can't understand what version is the most used you don't need to know.
>>
>>106981932
>>106981960
Post something impressive that makes me buy into chroma hype.
>>106981931
What's so great about this? Looks like typical sdxl slop I see everywhere.
>>
File: 3859208437.png (1.57 MB, 896x1152)
>>106982024
Nice, it actually doesn't look like Flux.
>>
File: file.png (72 KB, 916x448)
>>106981638
man, it was about a new CFG replacement method; it was posted here.
it compared CFG, CFG++, RescaleCFG and their method.
In all the benchmarks (FID/CLIP/etc.) CFG still ranked higher than the other meme cfgs.
oh wait found it:
https://arxiv.org/html/2508.12880v2
>>
>>106982070
let's not forget that they always pick only the benchmarks that show good results, so maybe that's the case with this one too
>>
>>106982095
maybe you're right, and the evaluations are still synthetic at the end of the day (which means they can be benchmaxxed to death).
The interesting part is that only their own stochastic guidance stood to gain from benchmaxxing in their tests, so we can assume all the other guidances weren't touched. The pdf also has picture comparisons. I just found this really interesting, and at the end of the day I truly think that messing with guidance is mostly snakeoil
>>
can wan do proper real life cartoon crossover stuff? just had a random thought of making a lora using roger rabbit, space jam, etc
>>
>>106981984
enjoy your housefire
>>
>>106982107
>I truly think that messing with guidance is mostly snakeoil
which is really surprising to me, what are the odds that CFG is the optimal guidance? it was literally the first one that existed lmao
>>
>>106982123
Pony v7 is pretty good at that.
>>
File: IMG_20251023_133544.png (1.47 MB, 832x1170)
>>
File: 1749542545154802.mp4 (1.27 MB, 1234x1080)
babe wake up, bytedance released a finetune of wan
https://huggingface.co/ByteDance/Video-As-Prompt-CogVideoX-5B
>Video-As-Prompt
that's... interesting...
>>
>>106982166
This looks like shit. I'm going back to bed.
>>
>>106982166
https://huggingface.co/ByteDance/Video-As-Prompt-Wan2.1-14B
Wan 2.1 again though
>>
>>106982166
>>106982174
https://youtu.be/S3zpLIMOU4c?t=12
so it's like wan animate but with additional effects?
>>
>>106982166
what's the point when Sora2 does this infinitely better?
>>
>>106982166
>cog
I forgot about that one, need to add it to my failbake list alongside hidream
>>
>>106982196
kek
>>
>>106982166
why did they make the reference videos semi transparent? are they retarded?
>>
>another bytedance scrapslop release that nobody will use
Looks like another batch of interns failed to make it onto the seed team
>>
>>106982239
this
>>
>>106982239
Let them shill here. They don't have anywhere else to advertise it.
>>
>>106981043
Good job keeping the asses on-model
>>
>>106982288
there will probably at least be a “comfy wen???” reddit post about it before it gets completely forgotten next week
>>
>>106982166
comfy wen???
>>
>>106982070
I hate going through these papers. Sure, you can make even a small dick look huge on paper if you measure from the asshole.
>>
>>106981043
lora?
>>
Is it possible to have a lora and checkpoint gallery in comfy, like you can with forge?
>>
>>106982520
there might be some nodes and I think swarmui does something like that
>>
Anime 1gril obsession = Male delayed development Doll Collecting.
>>
>>106979255
so which is the lora that will stop color shift?
>>
>>106982560
Any of them supposedly, depending on the type of video you gen.
Some are trained for one long video, some are trained for different transitions etc.
>>
>post locally genned video
>janny warns me

i guess it was too realistic?
>>
>>106980758

>https://github.com/stduhpf/ComfyUI-WanMoeKSampler
I used that for a while, but it produces nonsensical high/low step counts if you change the shift value too much, so I went back to the bongmath nodes.

>https://github.com/ChenDarYen/ComfyUI-NAG
Basic stuff with wan at cfg 1.

>https://github.com/Extraltodeus/Skimmed_CFG
I'm testing that one: in the first high-noise stage I gen without the lightx2v lora, which lets me run a very high cfg; I seem to have no slow-motion issues thanks to that. After that I switch to NAG at cfg 1 + lightx2v.
>>
>>106982647
stop posting scat in a blue general
>>
>been genning all day

Bros, it's 28c in my office with window open and it's 12c outside..
>>
>>106982697
Powerlimit to 75%. Open case.
>>
>>106982715
Where's the girl?
>>
>>106982697
>window open and it's 12c outside..
what gpu are you genning with? 4x5090?
just opening the windows when it's cold should be enough
>>
>>106982727
Lost for now
>>
>>
>>106982793
cute freckles
>>
>>106982809
Agreed
>>
File: 00006-66246357.png (3.12 MB, 1280x1600)
>>
Good morning anons, I have a couple of questions, if you would indulge me.

What is NAG?

Another thing: CFG on low noise in Wan2.2. I've heard 1.0 ignores the negative prompt, but 1.1 doesn't. I've been using 1.1, but I don't know if it's optimal.
>>
>>106982192
>Why doing open and verifiable research at all? Closed SaaS labs mogs anyway bro
Retard
>>
>>106982879
NAG is a way to guide your positive prompt away from the negative when using CFG=1. Normally CFG computes model(negative) + cfg * (model(positive) - model(negative)). When CFG=1 that reduces to just model(positive), so it's faster to run inference on the positive alone instead of a negative pass too. NAG tries to give you CFG>1-style control without incurring the higher inference cost.. but I don't think it works very well desu
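The combine step can be shown with a toy sketch (my own illustration with plain lists standing in for noise predictions, not any sampler's actual code):

```python
# Classifier-free guidance combine, illustrated on plain lists.
# pred_pos / pred_neg stand in for the model's noise predictions for the
# positive and negative prompts.
def cfg_combine(pred_pos, pred_neg, cfg):
    # Start from the negative prediction and move toward the positive one,
    # scaled by cfg.
    return [n + cfg * (p - n) for p, n in zip(pred_pos, pred_neg)]

pos = [1.0, 2.0, 3.0]   # stand-in for model(positive)
neg = [0.5, 0.5, 0.5]   # stand-in for model(negative)

# At cfg=1 the negative terms cancel: the result IS model(positive), which is
# why samplers can skip the negative forward pass entirely at cfg 1.
assert cfg_combine(pos, neg, 1.0) == pos

# At cfg>1 the negative actually matters and pushes the result away from it:
assert cfg_combine(pos, neg, 3.0) == [2.0, 5.0, 8.0]
```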
>>
>>106982715
existential angst in a hostile universe.
>>
>>106982697
check your memory temps and shit yourself
>>
>>106982879
>I've heard 1.0 ignores the negative prompt, but 1.1 doesn't
yes, but it also doubles gen time because it has to generate a negative prediction as well. You might as well increase the cfg if you're going to use it at all, because 2.0 or 3.0 vs 1.1 makes no difference in gen time, while setting it to 1.0 does.
>CFG on low noise in Wan2.2
using cfg on high noise would make more sense, because high noise is responsible for establishing the base motion of the video; low noise just fills in the details.
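The cost argument is easy to see by counting denoiser forward passes per sampling step (a toy sketch, no real model involved):

```python
# Why cfg > 1 roughly doubles gen time: every sampling step needs a second
# (negative/unconditional) forward pass, while cfg == 1 needs only the
# positive one. The exact cfg VALUE above 1 doesn't change the pass count.
def sampling_passes(steps, cfg):
    """Count denoiser forward passes for `steps` sampler steps."""
    passes = 0
    for _ in range(steps):
        passes += 1          # positive (conditional) pass, always needed
        if cfg != 1.0:
            passes += 1      # negative (unconditional) pass
    return passes

assert sampling_passes(20, 1.0) == 20   # cfg 1.0: one pass per step
assert sampling_passes(20, 1.1) == 40   # cfg 1.1: doubled
assert sampling_passes(20, 3.0) == 40   # cfg 3.0: same cost as 1.1
```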
>>
blessed anons, could someone share one of those vace workflows that connects videos seamlessly by using ending frames from the first video as reference?
>>
>>106983085
>yes but it also doubles gen time because it has to generate a negative video.

What the fuck, why have I never heard this before?


