/g/ - Technology

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106481665

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://rentry.org/wan22ldgguide
https://github.com/Wan-Video
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2122326
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>>106488620
There is no "version." The model is set default to 16fps/5sec. And people just interpolate to 32. Wan 2.2 can handle raw 20/24fps or maybe even more but you pay with more vram.
>>
>>106488626

Ok. Ive been using comfyui for a while. Works great. Just one thing I need though.

Where can I find a checkpoint or Lora that focuses on hot bbc pornstars like Selah Rain or Anastasia Lux or Remi Ferdinand or Mz Dani?

Really dying to create an image of all them getting facials by big black cocks.

Gracias
>>
>>106488626
Delete ComfyUI from OP
>>
The Gradio clan is fractured. They must join arms to take on the Comfy Menace.
>>
>>106488646
I know what you said, it's called hyperbole. And having custom nodes is not even close to have 50 different venvs.
>>
File: WanVideo2_2_I2V_00323.webm (309 KB, 1248x720)
>>
File: file.png (201 KB, 333x330)
>>
>>106488694
Sweet Jebus it's alive
>>
>>106488670
If trials don't kill him, lung cancer will
>>
How can one apply Bayesian thinking to image generation
>>
>>106488649
Delete yourself from life.
>>
>>106488648
So no Maria bose fans around, eh?
>>
File: 1755661824986883.jpg (570 KB, 1416x2120)
>>106488336
coincidental
>>
>>106488726
You deleted your penis from life.
>>
File: 1673502756417396.png (587 KB, 500x666)
anyone know how to consistently prompt facial features in flux/chroma? it still loves the same face, even without use of woman/lady etc. things like words for facial features; i try to prooompt different face shapes and they all end up the same face 25% of the time, and the rest of the time it's standard ai face. actually prompt adherence in general seems to be pretty fucking random desu.
>>
File: IMG_4744.jpg (689 KB, 1179x1243)
>>106488987
>marx
delete your 4chan account
>>
File: WanVideo2_2_I2V_00324.webm (2.49 MB, 1248x720)
>>
>>106489084
saar...
>>
>>106488649
why?
>>
>>106489084
lmao
>>
>>106489152
what a MASSIVE faggot
>>
File: blondie00.mp4 (3.6 MB, 720x648)
>>106488626
blessed thread of frenzone ;3
>>
>nsfw tentacle janny in collage
;c
>>
>>106488630
>Wan 2.2 can handle raw 20/24fps
How would you even control that? Doesn't adding more frames just make a longer video?
>>
So nothing really new after flux? Is local kill?
>>
File: hmmmmmmmmmm.jpg (208 KB, 930x710)
THATS ALOTTA GENERALS!!
>>
>>106489564
total frames in empty latent node (default is 81) = seconds x framerate in the video node (default 16), + 1 frame as the first. So 121 for 5sec/24fps
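The arithmetic in this post can be sketched as a small helper. This is only an illustration of the numbers quoted above (81 frames at 16 fps, 121 at 24 fps); the function name `wan_frame_count` is made up for the example and is not part of ComfyUI or Wan.

```python
def wan_frame_count(seconds: float, fps: int) -> int:
    """Frames to request: seconds * fps, plus one for the first frame."""
    return round(seconds * fps) + 1


print(wan_frame_count(5, 16))  # 81, the default mentioned in the post
print(wan_frame_count(5, 24))  # 121, the 5 sec / 24 fps case
```

The frame rate set on the video output node would need to match the `fps` used here for the clip to play back at the intended length.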
>>
pathetic
>>
>>106489589
Wouldn't that result in a sped up video?
>>
>>106489572
maybe we can just delete ldg and leave the others?
>>
>>106488662
>no need to generate these. they're all on Twitter and they post amazing shit
I took the original image from X (thats why I know thats a tranny) and it was a selfie with phone covering face. I used QIE to put phone away and hands down.
Btw do you know why everyone single one of them love maimai?
>>
>>106489084
>>106489091
literally NOT ai
do you have to shit up the general with the news\real-world posting every thread?!!? HUH?!?!?!?!?
>>
>>106488987
I consider Flux and Chroma diverse enough with faces. Other than the infamous flux chin, I get good results.
Prompting for specific shapes of faces simply like that cannot work.
It cannot be helped if the training didn't care that much for shapes of faces, some tokens like bushy eyebrows or square jaws might work; I'd suggest a Lora with the type of face you want to get at low strength to preserve some diversity.
Or try specifying age or nationality to move the results.
>>
How did flux chin even become a thing anyway? Like what kind of distillation methods did they use that such a distinctive feature would basically contaminate every face.
>>
How long are we cursed to only be able to prompt 5-7 second videos? I want to be able to prompt at least a singular minute in one go without having to stitch them together.
>>
>>106489850
not having your tentacle girl change positions every 5 seconds
>>
File: ComfyUI_00084_.mp4 (1.22 MB, 592x816)
>>
>>106489867
Actually awful.
>>
>>106489891
>>106489867
at 0.69x speed its watchable i guess ;c
>neg: mouth open, talking
>>
Does dataset size increase vram requirements for training?
>>
4chan will strip the workflow data in the video when you upload it?
>>
>>106489900
nope, at least not at the scales i've done (30-150 image loras). i dunno about full finetunes with large datasets
>>
>>106489941
No. We all just upload to catbox for the fun of it.
>>
>>106489900
No the number of parameters does.
>>
>>106489944
I have some old SDXL datasets with 200-500 pics each that I'd want to retrain for chroma. And one has 900 or so
>>
>>106489952
You'll likely have to train for longer on datasets of that size. Maybe prune the dataset to see if it even needs to be that big.
>>
>>106489952
it shouldn't matter then for that scale besides speed. are you doing a full finetune? unless you're training multiple concepts for a high rank lora i doubt you need that many images
>>
What do you guys use to train loras?
>>
File: ComfyUI_00093_.mp4 (795 KB, 592x816)
>>
File: its_over_123.gif (1.03 MB, 500x500)
Ok, the captioning models in Onetrainer can't handle obscure fetish stuff and it keeps tagging diapers as shorts or underwear or ignoring it. What's an external local tagging model that has no problems with freak stuff?
>>
File: AnimateDiff_00285.mp4 (1.92 MB, 720x720)
>>106489850
>in one go

Sure, I also think that 5 seconds is too short, but I think it would be better to have a segmented generation, perhaps with a segment length of your choice, but glued together in a natural way without imperfections. More than anything because I'm sure that if a whole minute were generated, at the end of the minute the generation would take an absurd turn
>>
>>106490009
sd-scripts. if you want a gui you can use this fork of easy scripts but idk if it supports chroma/lumina
>https://github.com/67372a/LoRA_Easy_Training_Scripts
>>
>>106490033
i wanted to go inside her mouth
>>
>>106490040
reconsider your goals
>>
>>106490076
I literally started doing AI because of this bruh.
>>
>>106486313
thanks anon
installed everything but it just segfaults after trying a gen (segfaults after init of the models as far as i can tell)
oh well
>>
File: ComfyUI_16696.png (2.8 MB, 1200x1600)
>>106488987
Try prompting the age or features (chubby cheeks, large nose, etc) or even negging out facial features or ethnic features. Negging ethnicities can help a lot because of the heavy Korean influence in most base datasets.

>>106489821
Sometimes the model can latch on to something innocuous during training (I had a problem with lanyards many LoRAs ago), BFL probably didn't catch it because they were focused on other things (coherency, etc)
>>
So am I understanding right that negative prompts in Chroma need to be written in natural language as well?
>>
>>106490040
>local tagging model that has no problems with freak stuff?
Your eyes :)

>>106490009
I use diffusion-pipe, I think it works fine for single GPU but it really is a multi gpu solution. Supports most models even *sighs heavily* chroma.
>>
File: IMG_4742.jpg (152 KB, 824x737)
>>106489568
>is local kill
kek
>>
>>106490192
Nuclear holocaust can't come soon enough
>>
>>106490042
>but I think it would be better to have a segmented generation
I'd prefer that, I just wanted to bitch about it. The progress we've had with videogenning is ridiculous so maybe having coherent longform videos without it going into eldritch territory will be possible in a few years. I'd be ok with at least 30 seconds.
>>
File: WanVideo2_2_I2V_00327.webm (2.55 MB, 1248x720)
>>
>>106489234
>>106489528
>>106489896
fuck off to your tranny circlejerk thread, subhuman tripnigger
>>
>>106490192
What site is that and what did you try to prompt?
>>
is there a way to have FaceDetailer skip blurry faces? they are 99% of the time not worth detailing and are not the main subject
>>
>>106490340
I don't use face detailer because frankly that's SDXL vramlet shit, but isn't there an option where you can specify the certainty threshold of what a face is and it will skip it? I'm sure there's a balance you can set where it will only detail the very obvious faces and skip the blurry ones.
>>
>>106490340
just learn to inpaint manually
>>
Do any of you use Adobe Lightroom? I'm thinking of getting it for easy(?) upscaling and for a replacement for lama/IO cleaner. Seems like it would be good but I wanted some opinions first.
Is there somewhere else I could ask this? I guess I could ask AI kek.
>>
>>106490361
stfu useless nigga
>>
>>106490369
I'm ignorant of lightroom. What does it do that I can't manually set up a workflow for in comfy?
>>
>>106490373
im serious. it'll save you more time in the long run, face detailer is too inaccurate. just learn to inpaint
>>
File: ComfyUI_00112_.mp4 (875 KB, 592x816)
AI is just terrible with multiple subjects
>>
>>106490382
why do you assume I don't already know and have always manually inpaint and am experimenting with FaceDetailer? bitch ass nigga always butting in with stupid comments fuck off
>>
>>106490394
then there is no way i am aware of to manually skip faces with face detailer
>>
>>106490353
using bbox I wasn't finding that sweet spot, it either found 4 faces or none. tried with segm and at least with the test image it seems to work. thanks
>>
>>106490374
>comfy
never speak to me again
>>
>>106490390
This is awesome, do more
>>
>one lora completely fucks up the face details but gets the action right
>the other lora maintains the face details but fucks up the action
this sucks
>>
>>106490450
use both and/or do a hiresfix pass with different loras
>>
>>106490394
>bitch ass nigga
I cringe.
>>
File: ComfyUI_00115_.mp4 (1001 KB, 592x816)
>>106490449
Most of the time they are not doing much. I think the AI is confused with so many subject
>>
I hate ComfyUI
>>
File: ComfyUI_WAN2.2__00002.mp4 (2.52 MB, 680x1016)
>>
>>106489572
>assblasted trani forks the general just so they can add tranistudio
>>
>>106490669
how does one group of autists wield so much power. the cabal must be stopped.
>>
File: AnimateDiff_00293.mp4 (3.86 MB, 576x1024)
>>
>>106490737
kek
>>
>>106489603
Yes.
>>
File: ComfyUI_WAN2.2__00007.mp4 (3.58 MB, 656x1048)
>>106490618
>>
Is there anything more epic than wasting whole day of training because of wrong settings. Adamw8bit is my new best friend.
>>
Memo to myself, always try out every model, don't have any prejudices.
“Vibevoice release, podcast stuff, mh, i dont need it” then i read thread that microsoft has deleted the 7b again.
So I download it and give it a chance.
Now I almost missed a model that can generate porn audio
kek
>>
>>106489850
>>106490042
Honestly the only thing segmented generation is missing is to take into account preceding. For example if we could feed the last x latent frames, it would be so much easier.
>>
>>106490776
Is it hard to set up?
>>
>>106490782
There are comfy nodes plug&play. just check github
>>
>>106490776
> a model that can generate porn audio
Can it? Example catbox?
>>
File: AnimateDiff_00291.mp4 (3.3 MB, 720x720)
>>106490219
>The progress we've had with videogenning is ridiculous so maybe having coherent longform videos without it going into eldritch territory will be possible in a few years.

Well yes, just think of the quality jump from Wan 2.1 to 2.2, they will certainly implement further improvements to the prompt, for example the possibility of starting a certain action at some point of the movie (for example, in the third second, the character begins to do x , then stops it and at the 6th second does y.. etc.)
>>
>>106490776
Is it actually that good? I've tuned out of voice stuff because 99% of the time it's just... okay.
>>
>>106490791
I sit on the toilet stoned and sit here for a while. So someone else should make the effort.
Yes, it can generate porn with voice cloning - it adapts the output to the text content. If you let it speak sexual content, the voice becomes erotic, mh and ahs are emphasized differently.
They must have trained on nsfw content. However, since it is a diffusion model, the result or the emphasis can vary greatly.
It can also use languages other than English and Chinese, depending on the seed perfect or broken pronunciation.

I can only recommend everyone to try it out. You are missing out.
>>
File: DS1 Baby Skeletons.webm (2.73 MB, 1280x720)
>>106490737
Why is this undead dancing with skeletons from the Catacombs?
>>
File: Qwan_00003_.jpg (667 KB, 2976x1984)
Trying some silly stuff with long-ass prompts, referring to the subjects as Woman1 and Woman2. Seems to work decently well, there's some bleed between them, though.
Still pretty good for not doing anything besides prompting.
>>
File: AnimateDiff_00107.mp4 (1.5 MB, 720x720)
vidrel made with wan2.1 btw

>>106490794
> these shaved waxed legs
>>
>>106490852
man youre finally back, mind sharing a catbox?
>>
>>106490852
The only thing I like more than 1girls are 2girls
>>
>>106490834
No plaps?
>>
>>106488626
Add this to OP, Forge:
----------From Panchovix--------:

-ReForge2dendev: https://github.com/Panchovix/stable-diffusion-webui-reForge/tree/newforge_dendev
-ReForge2: https://github.com/Panchovix/stable-diffusion-webui-reForge/tree/newmain_newforge

----------From DenOfEquity--------:
-ersatzForge: https://github.com/DenOfEquity/ersatzForge

----------From Haoming02--------:
-NeoForge: https://github.com/Haoming02/sd-webui-forge-classic/tree/neo

----------From lllyasviel--------:
-Legacy Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
>>
>>106490898
they're already there
>>
File: AnimateDiff_00108.mp4 (1.78 MB, 720x912)
>>106490856
Naaah I made that animation and I did it with 2.2

Here is an alternative gen that I made in a different resolution, and is made with 2.2

I still remember having to generate 5 or more alternatives with 2.1, just to choose a decent. 2.2 reduced the need to 1 or 2, the results are much more consistent
>>
>>106490898
>-ersatzForge: https://github.com/DenOfEquity/ersatzForge
in what reality do you include a 12 star fork of a rando there?
>>
File: ComfyUI_00204_.png (2.54 MB, 1024x1536)
>>
File: Qwan_00004_.jpg (726 KB, 2856x1896)
>>106490877
My workflow is full of custom nodes I wrote, but it's pretty much based on this concept:
https://civitai.com/models/1866759/qwen-image-modular-wf?modelVersionId=2124985
You're going to need a lot of VRAM or DRAM for it, but the result should be the same.


First pass QwenImage, then into Wan (Ultimate SD Upscale).
res_2s/bong_tangent for both.
Prompt: https://pastebin.com/3vC51Rck
>>
File: ComfyUI_00205_.png (2.29 MB, 1024x1536)
>>
>>106490936
ReForge2 dendev is based on this fork, and this fork is updated daily and has all the latest features
>>
File: workflow_flux.png (919 KB, 2132x1590)
How long are flux/chroma, or in general, natural language captions supposed to be? Some example WF I found had walls of text. Should I limit it in the system prompt or is this fine?
>>
File: ComfyUI_00206_.png (2.06 MB, 1024x1536)
>>106490969
With Disco Elysium from anon
>>
>>106490947
>>106490969
These are aesthetically pleasing.
>>
>>106490898
>this is what mental illness looks like
>>
File: Qwan_00005_.jpg (764 KB, 2856x1896)
>>106490990
Was that a Chroma LoRA? Looks cool. Both base image and LoRA, that is.
>>
>>106490485
Imagine this, but in VR
One day I will be able to generate hour long videos like this and just imagine myself being there
Just need to add the smell and I can die happy
>>
>>106490991
ty

>>106491000
Yup anon posted the other day https://civitai.com/models/1927225/
>>
man, changing the resolution even slightly in Wan is just such a crazy quality difference. It's not like normal image gen where you can get away with lower res. It increases the god damn intelligence of the animation, massively. You'd think steps are supposed to be responsible for that...
>>
>>106490951
thanks for the insight, I shall give it a whirl
>>
>>106490777
>For example if we could feed the last x latent frames, it would be so much easier.
yeah I find it odd that we've got First Frame and Last Frame inputs, but not First Frame, Second Frame, Third Frame inputs, etc, surely it shouldn't be that difficult? It's the same thing isn't it?
>>
>>106491003
Why would you want to imagine being stuck in a car with a bunch of dirty normie tik tok bitches? That would be a fucking nightmare.
>>
>>106491139
>Why would you want to imagine being stuck in a car with a bunch of dirty normie tik tok bitches? That would be a fucking nightmare.

Maybe this is the place for you >>106484063
>>
>>106491147
I hate anime. Those girls just look disgusting and obnoxious.
>>
>>106490987
Lodestone's own captions are even bigger. Some of his examples are outright autistic.
>>
>>106490755
It did great maintaining style throughout the entire clip.
>>
File: 1743326950204.png (1.13 MB, 1220x1376)
>>106490987
If they don't tell you, you can assume it is on the shorter side. The longest I've seen is Lumina Image 2.0's captions. Note this won't be the case for any of its finetunes which most assuredly aren't using its format to caption. Really sad they never open sourced it even if you can't get 18+ stuff with it. It explains why Lumina 2.0 is so good at concepts etc.
>>
>>106491222
Well adding *retain style* in the prompt has always given me good results
>>
File: IMG_20250905_174320.jpg (1.96 MB, 1800x2229)
>>106491223
Here are lodestone's
>>
>>106491253
Doesn't this go over the T5xxl limit?
>>
File: 1753646874765941.mp4 (2.48 MB, 1280x720)
Did they solve the style transfer?
https://xcancel.com/ideogram_ai/status/1963648390530830387#m
>>
>>106491310
>example style is hodgepodge slop
every time
>>
>>106491287
Yep, context limit of 512 which is why it's dumb. Lumina uses Gemma which is 8192.
>>
>>106491336
>Lumina
post tangible workflow or KYS scammer
>>
>>106491341
meds?
https://files.catbox.moe/0fnemj.json
>>
>>106491341
Schizo elsewhere retard, no one said you can't have tech discussions here and need to only post images or workflows.
>>
>>106491351
I schizo where I want, you're not my boss to tell me where to schizo.
>>
>>106491348
>file not found
>>
Alright, I have a whole ass archive of a specific fetish and 32GB of sweet Nvidia VRAM, I need wise ninja scrolls on wan2.2 lora training. What to do, what's good, what's not good, captioning tips, etc. I've trained SDXL loras before, but that's through sd-scripts, never used diffusion-pipe. Also need a magical way to remove watermarks.
>>
>>106491372
>Also need a magical way to remove watermarks
qwen image edit
>>
>>106491420
QIE randomly zooms in the image though
>>
>>106491372
nano banano
>>
>>106491372
>>106491438
>nano banana
>refuses images about circumsized bananas
explain this google!
>>
Oh also, if all of my sources are from irl porn, does that translate well to trying to do anime animations, or does that not work well?
>>
File: AnimateDiff_00296.mp4 (2.55 MB, 720x1104)
>>
>>106491446
nazi
>>
File: ComfyUI_00282_.png (2.03 MB, 1280x1600)
>>106490951
Prompting is quite a pain, but thanks for the guide
>>
>>106491461
I think that with all the porn material that has been on the web throughout history, making porn loras must be the easiest thing in the world.
>>
>>106491425
resize the image so the resolution is divisible by 112, not like it matters if the picture is zoomed in a tiny bit anyway
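A rough sketch of that resize rule, taking the "divisible by 112" figure from the post at face value (it is the poster's claim, not something verified here). The helper names `snap_to_multiple` and `snapped_size` are hypothetical, and snapping each side independently shifts the aspect ratio slightly.

```python
def snap_to_multiple(value: int, multiple: int = 112) -> int:
    """Round value to the nearest multiple, never going below one multiple."""
    return max(multiple, round(value / multiple) * multiple)


def snapped_size(width: int, height: int, multiple: int = 112) -> tuple[int, int]:
    """Snap both sides of an image size to the nearest multiple."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


print(snapped_size(1024, 1536))  # (1008, 1568)
```

The resulting size would then be fed to whatever resize node or image library is already in the workflow.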
>>
>>106490852
>tfw AI will bring us "live-action" manhwa adaptations
Can't wait for my villainesses to be depicted as korean qties as they should be.
>>
>>106491372
>Also need a magical way to remove watermarks.
After much testing, I've concluded that the most magical way to remove watermarks en masse from tons of images is with Florence2 bboxes paired with MAT fast inpaint from Acly inpaint nodes (only works on 512x512 cutouts, so cut out around bbox before inpainting). Florence2 is a bit of a journey to install, it has some conflicts with current comfy requirements and needs you to downgrade transformers (or upgrade to latest version. Regardless, the version comfy used to install was incompatible). I can catbox a workflow with said caveat that it's not going to work out of the box and would require wrangling.
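The "cut out around bbox" step could look something like the sketch below. It only computes a fixed 512x512 crop window centred on a detection box and clamped to the image bounds; the function name `crop_window` and the `(x0, y0, x1, y1)` box layout are assumptions for illustration, not the API of Florence2 or the Acly inpaint nodes.

```python
def crop_window(bbox, image_size, size=512):
    """Return (left, top, right, bottom) of a size x size window centred on
    bbox and shifted as needed to stay inside the image."""
    x0, y0, x1, y1 = bbox
    width, height = image_size
    if width < size or height < size:
        raise ValueError("image is smaller than the crop window")
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    # Clamp the top-left corner so the window never leaves the image.
    left = min(max(cx - size // 2, 0), width - size)
    top = min(max(cy - size // 2, 0), height - size)
    return left, top, left + size, top + size


# A box near the bottom-right corner gets a window pushed back inside.
print(crop_window((900, 950, 1020, 1000), (1024, 1024)))  # (512, 512, 1024, 1024)
```

A box larger than 512 pixels on either side would not fit in the window, so that case would need separate handling (for example downscaling first).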
>>
File: joycap.png (52 KB, 905x473)
Also for lora training what to pick? I think ticking the ambiguous language was mandatory, right? It's for fetish goonshit.
>>
>>106491597
>using comfy for training
lmao what a noob mistake
>>
>>106491476
Yum-Eeeee!
>>
>>106491605
Just for captioning
>>
>>106491597
Yes but what's the fetish?
>>
>>106491669
diapers and ageplay
>>
Clowness anon, any progress?
>>
>>106490856
>>106490934
>video made with wan
>filename 'animatediff'
Are you trolling?
>>
>>106491595
https://github.com/jferments/watermark_remover
>>
>>106491679
Pretty sure it's the default output name for one of those third party node solutions, yes, this is clearly Wan
>>
>>106491597
what base model are you using?
>>
>>106491735
Chroma
>>
>>106491748
Chroma training dataset was captioned using Gemini, so look at examples from that and try to get JoyCaption to mimic those
>>
>>106491717
>Simple-LaMa for inpainting with 256 resolution
Still I think a yolo node would be faster than using the big ass florence2 for a dataset + MAT fast inpaint
>>
>>106491787
you're right. installing both a third party node and a graphical UI for image generation models, launching a server & loading the node is a lot easier than invoking a python script
>>
>>106491835
A testament, to my Glory....
>>
>>106491652
you do know there are online instances of joycaption tagging and most training guis have built in tagger support right?
>>
File: ComfyUI_WAN2.2__00008.mp4 (1.16 MB, 688x1000)
>>106490755
>>
Blessed thread of frenship
>>
File: Qwan_00009_.jpg (750 KB, 1896x2856)
>>106491476
Yeah. It's very literal about colors and shapes, needs some wrangling from time to time.
>>
>>106491871
I am NOT sending those pics anywhere online especially when even HF started flagging some content. And the tagger in Onetrainer is censored and/or stupid and just ignored the nsfw elements.
>>
>>106491835
You think your python script isn't pulling third party code retard?
>>
What is a good NSFW tagger actually?
>>
>>106491904
true, the entirety of python is pulling from something else
>>
>>106491904
you're right, it was wrong of me to doubt such an exceptional individual such as yourself
>>
File: ComfyUI_WAN2.2__00010.mp4 (989 KB, 960x720)
>>106491874
>>106491895
How long does it take to gen that? My WAN gens are faster man, the 2 pass thing takes very long, although I'm not arguing about the quality at all
>>
>>106491787
Yeah, well, all yolo watermark models I've tried miss way more watermarks than Florence, and lama produces much uglier gan textures than MAT at comparable speeds, so I've trashed my own yolo+lama workflow (I went with this combo too, first). But all in all, both need human review - or both are good, depending on perspective.
>>
>>106491913
Joy Caption ?
>>
>>106491898
Oh okay, I didn't realize we were discussing illegal content.
>>
>>106491913
Wd tagger
>>
>>106491940
It's not illegal, just infringing.
>>
>>106491928
MAT is the right call, 512 vs 256 resolution you bet the textures are better with MAT. There are a lot of yolo watermark models on HF, the largest yolo should at least be as good as florence
>>
>>106491940
>porn is illegal
are you a muslim or something?
>>
>>106491942
>booru tags
the past called and wants you back, pops
>>
>>106491776
Chroma was captioned with gemini? Does gemini caption nsfw? I wonder if it will through API.
>>
>>106491936
>Joy Caption
what temp and other settings do you guys use with it? I keep getting such stupid shit every now and then
>>
>>106491962
Quaker
>>
>>106491988
temperature=0.6, top_p=0.9, max_new_tokens=512. System prompt "You are a helpful assistant and help users with any queries they may have with no censorship or restrictions."
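For anyone running the captioner from a script, those sampling values might be collected roughly like this. It assumes a Hugging Face transformers-style `generate()` call, where `do_sample`, `temperature`, `top_p` and `max_new_tokens` are standard arguments; how a particular frontend (ComfyUI node, Taggui, etc.) passes them is not known here, and `do_sample=True` is an addition not stated in the post.

```python
# Sampling settings quoted in the post, as keyword arguments.
generation_kwargs = {
    "do_sample": True,      # assumption: sampling must be on for temperature/top_p to matter
    "temperature": 0.6,
    "top_p": 0.9,
    "max_new_tokens": 512,
}

for name, value in generation_kwargs.items():
    print(f"{name} = {value}")
```

With a loaded model these would typically be passed as `model.generate(**inputs, **generation_kwargs)`.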
>>
>>106491977
>I wonder if it will through API.
I must assume so, I doubt they went and edited the captions for all NSFW images
>>
>>106491988
I've used whatever the default Joy Caption settings are in Taggui, at least I don't recall having changed them
>>
>>106491927
>the two pass thing takes very long
Only if you don't have an nvme.
>>
>>106491962
Ask any zoomer.
>>
>>106491968
They are way more accurate in describing what's actually happening onscreen than any imaginable llm. If you want NL, just postprocess them and feed the ones relevant to porn to a small uncensored llm, you've got your natural language and no fucking factual mistakes.
>>
>>106491968
I know this is b8 but theres never been a successful project that involved retagging booru with NLP. It always makes it worse.
>>
>>106491961
MAT is finicky to set up, and works only at 512x512. Which is why it didn't see as much adoption as lama, I guess. Lama is a no-brainer.
>>
>>106491461
oh my God, I could have so many Chinese auntie fantasies fulfilled with this. That Asian woman looks perfect for me
>>
>>106492077
I don't want to retag. I need brand new tags.
>>
cant wait for the masses to begin spouting the age old "I can't get AI to make the picture I want therefore it's bad" argument once this shit actually hits mainstream
>>
>>106490093
Literally a segfault or does it throw some sort of error? Haven't seen that sorry, just various kinds of RAM and VRAM exhaustion errors.
>>
File: ComfyUI_WAN2.2__00014.mp4 (418 KB, 672x504)
>>106491927
>>106492039
I do, guess the workflow could be optimized further, need to test some more. That was literally only my 2nd Qwen gen.
>>
>>106492108
>once this shit actually hits mainstream
Bro, it's been three years since SD first release.
>>
>>106492122
It's funny going back and read anon predictions from back then and see how much almost everyone was completely wrong.
>>
>>106492122
How many generate images daily or at least weekly do you think
>>
>>106492165
You tell me https://openrouter.ai/rankings#images
>>
>>106492148
Such as?
>>
>>106492177
Is that not a graph of total images? Not the same as asking "how many people..."
>>
>new model comes out with improved prompt adherence
>find out it can do specific stuff that the old model couldn't
>start prompting for even more specific stuff
>it can't do it
It's amazing how we can keep making steady progress yet there is still so much room for improvement.
>>
>>106492198
>Acktually
I accept your concession on your "until it hits mainstream"
>>
File: ComfyUI_WAN2.2_00002.mp4 (544 KB, 672x504)
>>106492117
>>
>>106492203
All current open models still cannot do some stuff even dalle3 could do back in the day.
Try making subjects do the dab pose, the korean finger heart gesture even on Qwen, it fails
>>
>>106492188
- we won't use GPUs at all for gen and instead use specialized cards
- we would have perfect models locally

and many other random other ones with anons very sure of themselves
>>
>>106492122
>he thinks stable diffusion started it all
>>106492222
Doesn't change the fact that we have no hard numbers on daily users across the board cloud and local
>>
>>106492148
the most wrong crowd and opinion were ai doomers, it wont ever be better than this for sure, its a fad, the bubble will pop etc
>>
>>106492230
can i ask for the workflow for this?
>>
>>106492304
I really shouldn't entertain autists
>>
>how many gun owners are there in the world?
>heres a chart of bullets fired per day
>that doesnt answer my question
>i accept your concession
???
>>
File: 1730318350837201.png (5 KB, 214x101)
>cumfart ui now doesnt only need a 140gb + tip pagefile to not crash on a 128gb ram system it also recently fucked memory management inside vram too
great
>>
>>106492321
>lateral thinking isn't my thing
I figured already
>>
>only 24GB
oof.....
>>
>when I lie on the internet
>>
>>106492336
What are you running that needs that much memory?
>>
>>106492230
kek
>>
>>106492340
I'd ask you to elaborate on how you derived daily or weekly users but I know you'll respond with another ad hominem
>>
>>106492305
>local ai will be fully banned next year, mark my words!
>we will never have anything better than sd1.5!
>all jobs will disappear by next year!
>>
>>106492347
the vram fuckery is happening with basic noobai workflow that worked normally until recently, ram problems started since qwen image edit
>>
>>106492336
use another web ui, don't karen the thread
>>
>>106492368
no other ui supports all video gen features and optimizations, vramlet
>>
>>106492336
Have you updated to the fixed version? No way you need this for fucking noob
>>
File: ComfyUI_WAN2.2_00004.mp4 (503 KB, 672x504)
>>106492230
>>106492310
it's this: https://civitai.com/models/1818841/wan-22-workflow-t2v-i2v-t2i-kijai-wrapper

some modifications, broadly that.
>>
>>106492377
everything is the latest version, but it does seem like that vram fuckery fixed itself for all images after the first generated image finished
>>
>>106492294
>- we won't use GPUs at all for gen and instead use specialized cards
Maybe people said this in LLM threads, never saw anyone in imagegen threads claiming gpus would be replaced
>>106492294
>- we would have perfect models locally
lol, not even cloud/saas models are "perfect". What we still don't have, though, are local models with vast knowledge of pop culture, celebrities etc. Dalle3 used to be amazing at these, it did feel like it was trained on everything. Prior to being censored, or when jailbreaks worked, I remember people pointing out it even knew who AOC (the congresswoman) was
>>
File: ComfyUI_00457_.png (1.44 MB, 1328x1328)
>>106492231
skill issue, just let llm write the prompt
>describe the image of a person doing a dab pose, focus on body part positioning, in 200 words
>>
>>106492357
Openrouter is giving you an estimate of around 38.7M images generated weekly. That's from startups or direct users since the main labs aren't displaying the number of gens. You can easily infer that all main labs gens combined >> Openrouter gens.
Generating images isn't like generating text, where you can have agents consuming a lot of tokens without your supervision. Besides, when generating images on an online service, you're either hit by hard limits if you get a subscription or the cost goes through the roof if you pay per request, which by design limits the number of gens per person. At most 20-30 gens per person per day seems a proper estimate. That gives a few million people per week generating images. If that's not mainstream, I don't know what is.
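
For anyone who wants to check the napkin math, here is a rough sketch. The 38.7M weekly images and the 20-30 gens per person per day are the numbers from this post; the function name is made up for illustration. Note that the Openrouter figure on its own only implies a couple hundred thousand people, so reaching "a few million" depends on the assumption that the main labs do several times that volume, and there is no hard number for that here.

```python
def weekly_users(images_per_week: float, gens_per_person_per_day: float) -> float:
    """People implied by a weekly image count, assuming everyone gens at the
    same fixed daily rate, seven days a week (a crude assumption)."""
    return images_per_week / (gens_per_person_per_day * 7)


# Weekly figure quoted in the post; not independently verified here.
OPENROUTER_WEEKLY_IMAGES = 38.7e6

low = weekly_users(OPENROUTER_WEEKLY_IMAGES, 30)   # heavy users -> fewer people
high = weekly_users(OPENROUTER_WEEKLY_IMAGES, 20)  # lighter users -> more people

print(f"Openrouter alone: {low:,.0f} to {high:,.0f} people per week")
```

Lowering the assumed gens per day raises the head count proportionally, so the estimate is only as good as that guess.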
>>
>>106492321
Imagine keeping a shit diary instead of a food diary. Might be a fun way to scare housewives with the calorie tracker app.
>what did you eat
>here's a pic of the turd
>>
>>106492435
nta but you're forgetting to account for the chinese labs mass genning to curate a large synthetic dataset for their next local sota model
>>
I am using Qwen-Image with Replicate. What is the prompt and negative_prompt to use to make the person in the image totally nude? They usually still keeping their underwear on.
>>
>>106492435
Imagine a world where the average person generated only 20 to 30 images a day. How do we account for those with intense autism as in the kind present ITT.
>>
>>106492498
>X-ray women
That takes me back
>>
>>106492498
Explicitly prompt for genital words mostly gets the underwear off, but the training data is completely devoid of genitals anyway so all you get is body horror that looks kinda like scrotum.
>>
File: chromasome.jpg (278 KB, 1376x1072)
>>
>>106492383
Has anyone tried to animate scenes from the Berserk manga yet?
>>
File: rikka3.mp4 (1.54 MB, 720x1024)
>>
Anyone finding wan2.2 high noise being uncensored? I clearly saw pussy in the high noise preview but once the low noise begins it immediately puts underwear on top.
Is there a way to utilize this?
>>
>>106492677
You have two options:
1. Get lucky
2. Use loras
>>
File: ComfyUI_temp_gurpb_00001_.png (3.19 MB, 1152x1152)
Somehow I accidentally created a reddit sloppa
>>
>>106492614
looks like shit
>>
>$150000 later...
>>
File: ComfyUI_00151_.mp4 (694 KB, 592x816)
>increasing image resolution or length increases the generation time exponentially
when will this be solved?
>>
>>106492698
brap
>>
>>106492747
Heat death of the universe
>>
>>106492747
NICE TITS
>>
>>106492747
wasn't radial attention supposed to fix this
>>
>>106492747
when you understand how math works
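
To spell the math out: with full self-attention the cost grows roughly with the square of the token count, so it is quadratic rather than exponential. A minimal sketch follows, where the compression factors (8x spatial, 4x temporal, 2x2 patches) are assumptions about a Wan-style video model and not confirmed values, and only the attention part is modeled (the other layers scale closer to linearly).

```python
def latent_tokens(width: int, height: int, frames: int,
                  spatial_down: int = 8, temporal_down: int = 4,
                  patch: int = 2) -> int:
    """Rough token count for a video diffusion transformer.
    All compression factors are assumed defaults, not confirmed values."""
    tokens_w = width // (spatial_down * patch)
    tokens_h = height // (spatial_down * patch)
    tokens_t = (frames - 1) // temporal_down + 1
    return tokens_w * tokens_h * tokens_t


def relative_attention_cost(tokens: int, base_tokens: int) -> float:
    """Full self-attention scales with the square of the token count."""
    return (tokens / base_tokens) ** 2


base = latent_tokens(592, 816, 81)             # resolution of the clip above, 81 frames assumed
double_res = latent_tokens(1184, 1632, 81)     # 2x width and 2x height
double_len = latent_tokens(592, 816, 161)      # roughly 2x the frames

print(relative_attention_cost(double_res, base))  # 4x the tokens -> 16x attention cost
print(relative_attention_cost(double_len, base))  # ~2x the tokens -> ~3.8x attention cost
```

Doubling both width and height quadruples the tokens, which is why resolution hurts far more than length.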
>>
>>106492403
>Just write 50 words to describe a single concept, bro
>>
>>106492403
>>106492912
this, I don't want to write a bible to describe something that can be said with a single word, that's dumb
>>
>>106492841
We need quantum deep networks in neuronal quantum chips.
Fuck math.
>>
>>106492747
whats the name of this semen demon
>>
>>106492971
Dave Mustaine
>>
Ah btw.
I just read that chatterbox multilingual was released today and now supports 23 languages.
>>
>>106492959
that's all math
>>
>>106493062
Holy shiet, ty
>>
>>106493062
hejsan homopojkar
>>
File: 1747054662507336.png (1.27 MB, 1104x1472)
>>106492912
I don't know why that anon used an LLM, it worked with just "man doing the dab pose".
That being said, llm enhancement is used by pretty much all the closed source models so not a bad tactic.
>>
File: Qwan_00013_.jpg (817 KB, 1896x2856)
>>106492614
Heh. My Chroma generations of my prompt turned out exactly like that, with some absolute drag queens.
>>
What’s the current workflow to gen longer videos in wan 2.2 apart from just raising the frames? I managed to raise it to 144 frames for 9 secs but there’s probably a way to use the last frame to double the length no? Unless consistency would be an issue
>>
>>106493118
>llm enhancement is used by pretty much all the closed source models so not a bad tactic.
yeah fair, boomer prompting always improves prompt adherence so they probably all use this on their API models
>>
NeoForgeGODS I want to use Chroma, which things do I have to download and where to put them? And which settings
>>
>>106493173
Well let's be real, most of the chinamen women are complete dumpsterfires
>>
>>106493422
>>106493173
Forget it. I want to run Qwen. I hate the Chroma trannies. Chroma blends the sexes. It appears as if Lodestone didn't tag male and female. You can probably have a female with male feet dataset.
Disgusting.
>>
>>106493436
>Forget it. I want to run Qwen.
qwen is so fucking slopped though, fortunately there's some loras to fix that
https://civitai.com/models/1927710?modelVersionId=2181911
>>
>>106493436
>Chroma blends the sexes.
Did you forget to put "transexual, tranny, masculine, LGBT" in the negatives
>>
anons, I want to train a lora of this insta model. She's got such a unique look that it didn't work in 1.5 back then.. So whats the best model to train on for max versatility now...
xl? pony? chroma? wan?
>>
>>106493198
Seconding this. 100% its going to be some "take last frame" bullshit. Cant wait to get out of 5-8 second hell
>>
>>106493571
>Cant wait to get out of 5-8 second hell
just two more years and a minimum of 98gbs of vram
>>
File: Qwan_00017_.jpg (862 KB, 1896x2856)
>>106493436
That would explain the manhands that it likes to give women, heh.
>>106493428
Dunno if I'd agree, saw a lot of real cuties during my time there.
>>
>>106493571
>Cant wait to get out of 5-8 second hell
good luck with that, you need a shit ton of memory to make long videos
>>
>>106491461
Can you make him hump her?
>>
>>106493571
Is 8s the max we can stretch on WAN?
>>
File: 79022355.png (1.77 MB, 1024x1024)
>>106493581
>>106493623
Animatediff can do long (shitty) videos. Wonder if someone smart could take some of that technology and stick it in wan somehow
>>
File: 1732955633242405.png (1.09 MB, 1397x1365)
https://xcancel.com/bdsqlsz/status/1963984028476014841#m
there's more examples about that chinese model that will be (soon?) released locally
>Advantages: Native 2K output, default is high-definition result
>>
im gunna put anistudio in the next OP
>>
>>106493477
does that actually work
>>
>>106493198
you could have it so that it does a 5-9 second video and then have it auto-pickup the last frame with a secondary prompt that queues after the first clip finishes

you could technically automate a third and fourth rotation as well with a workflow.
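
A rough sketch of that chaining loop, in case it helps. `fake_i2v` is a hypothetical stand-in for whatever image-to-video call your workflow exposes (it is not a real ComfyUI or Wan API), and frames are plain strings so the logic can run anywhere. The 16 fps figure is inferred from the 144 frames for 9 secs mentioned earlier in the thread.

```python
from typing import Callable, List, Sequence

Frame = str  # placeholder type; a real workflow would pass images or tensors


def fake_i2v(start_frame: Frame, prompt: str, num_frames: int) -> List[Frame]:
    """Hypothetical image-to-video call: returns the start frame plus labeled frames."""
    return [start_frame] + [f"{prompt}#{i}" for i in range(1, num_frames)]


def chain_clips(first_frame: Frame, prompts: Sequence[str], frames_per_clip: int,
                generate: Callable[[Frame, str, int], List[Frame]] = fake_i2v) -> List[Frame]:
    """Generate one clip per prompt, seeding each clip with the previous last frame."""
    video: List[Frame] = []
    start = first_frame
    for prompt in prompts:
        clip = generate(start, prompt, frames_per_clip)
        # Later clips begin on the previous clip's last frame, so drop that duplicate.
        video.extend(clip if not video else clip[1:])
        start = clip[-1]
    return video


video = chain_clips("init", ["walks", "sits", "waves"], frames_per_clip=81)
print(len(video), "frames,", len(video) / 16, "seconds at 16 fps")
```

Only the last frame carries over between clips, so motion and character details can still drift from one segment to the next.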
>>
>>106490834
Nobody else has posted anything, could you post some sample audio?
>>
>>106493660
It can go longer but you have to have a shit ton of vram. Longest I can go is 8 secs on the all-in-slop wan 2.2/2.1 by phr00t
>>
>>106493173
what I would give for a full nsfw finetune of that model instead of the horrors chroma spits out by default
>>
>>106493669
>anime figurines as a test
stuff no western company will ever do
>>
>>106493669
inch resting. native 2k doesnt have as much pull anymore desu since it should be expected of modern models. but still cool.
>>
>>106493718
yep, western dogs are devoid of fun, china still has its sovl (and it's even more noble of china to use some Japanese anime references when you know that Japan is China's biggest rival)
>>
>>106493727
>native 2k doesnt have as much pull anymore desu since it should be expected of modern models.
is it? I have yet to see models that were trained on 2k, Chroma was trained at 512x resolution for the vast majority of its process
>>
>>106493705
>but you have to have a shit ton of vram
how much? the most i can get is 32 with the 5090.

>inb4 buy a 6000 blackwell
does anyone here even have one?
>>
>>106493718
Western companies would probably avoid cute women and hot models altogether, anime or not.
>>
File: Qwan_00019_.jpg (700 KB, 1896x2856)
>>106493706
I don't really gen NSFW but man, that and getting rid of some of the slopped faces/backgrounds would honestly be my dream model. I still like it a lot, at least in terms of genning 'clean 1girl' stuff. For styles Chroma still wins out.
>>106493669
I don't know if I love the output.
There's some prompt bleeding on the box and the figure just looks weird in general, not really like a figurine would.
Still, looking forward to new toys.
>>
>>106493781
>does anyone here even have one?
one or two anons, I think one of them from their work and the other is rich enough to afford getting one
>>
>>106493788
>I don't really gen NSFW but man, that and getting rid of some of the slopped faces/backgrounds would honestly be my dream model. I still like it a lot, at least in terms of genning 'clean 1girl' stuff. For styles Chroma still wins out.
how much "bleed" is there when there are two characters?
chroma is awful at that
>>
>>106493788
>I don't really gen NSFW but man, that and getting rid of some of the slopped faces/backgrounds would honestly be my dream model
same, and it should be an edit model so that you can use any image input as a character reference, would be the perfect model
>>
>>106493669
the contrast on the left stinks of dpo sloppa
>>
>>106493062
which chatterbox? there's like 3 different webui's
>>
>>106493781
Some anons have one of those. Even then, doesn't matter how long we can make it, wan's context is shit and it'll slop out and lose consistency on long gens. We're just waiting for radial attention or another technology that'll hopefully solve this issue
>>
>>106493816
yeah, it even has the manlet effect kek
>>
File: Qwan_00020_.jpg (728 KB, 1896x2856)
>>106493804
See
>>106490852
>>106490951
>>106491000
There still is some bleed, but way less than I have seen in most models, and it's mostly small stuff like accessories (rings, make-up). Hair styles, poses and general clothing have usually been on point for me.

>Floating koi
It's magic, for sure.
>>
>>106493855
i want a pet koi so badly
>>
>>106493738
yeah bro illustrious 2.0 does it among others
>>
Anyone have a csv/txt to use with wildcards for all the Head expressions from here? https://tagexplorer.github.io/#/
>>
>>106493877
>illustrious 2.0 does it
I don't see the point of training a SDXL model at 2k resolution, its vae is so bad it's not worth it
>>
>>106493622
I had the cuties and those were the minority

>>106493816
Which is better, dpo or dmd slop?
>>
>>106493855
>There still is some bleed, but way less than I have seen in most models, and it's mostly small stuff like accessories (rings, make-up). Hair styles, poses and general clothing have usually been on point for me.
that's better than chroma, which can make a male female, change their hair or build etc
I need to check if I can use regional prompting with it I guess
>>
>>106493888
You don't train tho
>>
>>106493923
you don't train too
>>
>>106493622
>>106493893
You can find the cuties in the more advanced tier 1 cities, younger people flock to them from the villages. The hottest girls are always in demand, and they know it, so most of them try their luck there too.
>>
File: 29191232.mp4 (3.76 MB, 816x624)
>>
File: qwenedit_00022_.png (1.73 MB, 1056x1584)
>>106493669
Quick Qwen Edit test with shitty prompt and no second pass or upscale.
Original image was cut off before the end of the skirt so that's all just what it implied. Kinda shortstacked.
>>106493893
Maybe it also depends on the region, I dunno. Only been near Wuhan.
>>
>>106493933
you don't train either
>>
add ahneestudio to next OP
>>
>>106493888
point is you gen at 2k so the vae doesn't screws up alot less details
>>
>>106493948
so the vae screws up alot less detail*
>>
did i miss out on the golden age of white men only imggen or did that never exist
>>
>>106493947
Fuck off fag. It will be added when it's WORTH using. OP is for the noobs
>>
>>106493959
the g in ldg isn't for gay
>>
File: slopppa.jpg (1.82 MB, 2048x1673)
>>106493669
ouch
>>
>>106493974
retards will prefer left
>>
>>106493944
I explored chinamen east and taiwan from coast to coast. Very cool, can recommend
>>
File: character+pose attempt.png (1.01 MB, 905x1120)
nano banana won
>>
>>106493974
oof, it's so slopped, even more than QIE, DOA
>>
>>106493974
yeah that's just doa unless it's better than qwen edit
>>
baker?
>>
>>106494002
it'll be doa anyway even if it's slightly better than QIE, since the Qwen fags hinted they'll make a QIE 2.0 at some point
https://xcancel.com/Alibaba_Qwen/status/1959172802029769203#m
>>
i'm maxing my 64gbs of ram with comfy, any of you run 128gb of ddr5 without any issues? i heard it can cause some booting problems on AMD but not on Intel
>>
>>106494019
i like it when the chinese rat race each other. local only wins. my nut is built on their egos.
>>
>>106494019
sadly it's shit at nsfw
>>
https://xcancel.com/elliotarledge/status/1963836335904768179#m
>Dropping support for older architectures like Pascal and Volta is painful for some, but it's a necessary evil to unlock the full potential of Blackwell and beyond.
VRAMLETS BTFO
>>
>>106494015
4min
>>
>>106494028
unfortunately they rat race at each other with slopped mememarks, if only they would understand that plastic skin looks bad, they're chinese but they're acting like South Koreans by wanting silicone plastic skin everywhere :(
>>
>>106494036
+300sec
>>
>>106494033
AAAAAIIIIIEE
>>
>>106494032
>for now
>>
>>106494033
is there a single person even prompting on 10xx series? most vramlets are at least 20xx, most commonly 3060
>>
>>106494083
I hope so anon
>>
File: Sillyanon_00293.mp4 (2.13 MB, 720x720)
>>106491679
>video made with wan
>filename 'animatediff'

yup, used a frame interpolation node and the video combine node that had a default 'animatediff' preamble, never changed it... until now
>>
>>106494085
P40fags
>>
>>106494102
>>106494102
>>106494102
>>
>>106494085
>>106494093
right in the feels anon...right in the feels
>>
>>106494111
ty 4 bake
>>
File: 1737642723206398.png (461 KB, 468x577)
>>106494033
>CUDA 13.0 released a month ago
>gay faggot nigger good goy shitter checkmark pissfluencer "CUDA 13.0 JUST dropped"!!!
>>
>>106493982
Ok. From this comparison I know how they trained banana. Clever.


