/g/ - Technology




Infinite Vid Wait Room Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106895887

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: ComfyUI_06012_.png (1.1 MB, 1192x872)
>>
>>
we're going to be using illustrious for the next 10 years at this point
>>
>>106904234
Hey, that's a famous jewish character of a dumb American.
>>
Blessed thread of frenship
>>
do people still use wan vace?
use case?
>>
>>106904277
>>
>>106904339
>i2v of girl you like
>reference photo of yourself
>????
>profit
>>
>>106904296
>we
Not everyone is into tranime artworks and 1girls exclusively
>>
File: ComfyUI_07321_.png (2.09 MB, 1152x1152)
>>
Uh oh split thread i like it!
>>
>>106904399
wut
>>
>>106904407
put /ldg/ in the title faggot
>>
Can you guys post a funny?
>>
with epsilon scaling at 1.005

>>106904296
if someone makes a frankenstein version of noob with an LLM TE and better VAE, this is genuinely possible.
>>
File: ComfyUI_07328_.png (1.72 MB, 1152x1152)
>>
>>106904430
with epsilon scaling off.
epsilon scaling is the clear winner in this comparison. I recommend turning epsilon scaling on always.
>>
File: ComfyUI_07330_.png (1.85 MB, 1152x1152)
>>
File: ComfyUI_07340_.png (1.84 MB, 1152x1152)
>>106900267
I wonder if API having that annoying AI aesthetic is just them censoring it to prevent misuse, or if they suck at training the model. In ClosedAI's case given how good Dalle was and then Sora 2 dropped I'm leaning more towards censorship.
>>
File: 1753645738444570.gif (2.93 MB, 640x640)
What's the "ideal" number of images for a lora training dataset? I have about 1000 high quality images for a couple of gravure idols (each) that I want to make. I've seen most people say 30-50 but most loras I've used are dogshit. Does a high amount of images/training time really not have much impact on the outcome of the lora?
>>
File: ComfyUI_07344_.png (2.23 MB, 1152x1152)
>>106904562
>>106900267
>>
>>
File: ComfyUI_07345_.png (2.47 MB, 1152x1152)
>>
File: ComfyUI_00006_.png (808 KB, 1024x1024)
>>106904636
>>
File: ComfyUI_00036_.png (544 KB, 1024x1024)
>>106904666
Adding NAG to my wf.

Is NAG dumb?
>>
>>
>>106904369
post one of your gens
>>
>>
File: ComfyUI_00037_.png (884 KB, 1024x1024)
>>106904677
oink oink
>>
looking into training chroma with onetrainer, what model should i be using for captions? i assume blip2?

also not sure what to use as the initial caption/caption prefix. should i just use the name of the character or add a more natural sentence like "A photo of X"?
>>
File: ComfyUI_00038_.png (1.48 MB, 1024x1024)
>>106904703
>>
>>106904688
>>106904698
she really let herself go after graduating from hogwarts
>>
File: 1729395857586534.jpg (150 KB, 1920x1080)
>>106904218

I envy you people with 24gb+ of VRAM you can generate any degenerate shits you want.
>>
File: ComfyUI_00040_.png (1.48 MB, 1024x1024)
>>106904739
>>
File: ComfyUI_00041_.png (1.52 MB, 1024x1024)
>>106904818
16gb vram. 32gb+ of system ram. 2-10 minute image gens, depending on various factors.
>>
File: ComfyUI_07354_.png (1.94 MB, 1152x1152)
>>
File: ComfyUI_07351_.png (1.68 MB, 1152x1152)
>>
>>106904838
What about VidGen ? Thats the real shit right there. Anyone can do ImageGen nowadays
>>
Was there ever a consensus on the new lightx2v high noise lora for wan? I tried it and it seems to give much slower motion than the old wan 2.1 lightx2v lora, are people running it at higher strength? Or did we just go back to the old lightx2v once again?
>>
>>106904838
i'd blow my head off if i had to wait 10 minutes per image
>>
>>106904867
>never had dialup
>>
>>106904849
multiply by however many frames you need, I guess. I haven't played with WAN.
>>
File: ComfyUI_00044_.png (1.34 MB, 1024x1024)
>>106904838
>>
>>106904875
>back in my day
pick the scab on your aging head
>>
>>106904890
You should revere the powerful saints of technology.
>>
>>106904875

good luck running that engine until warning lights go down
>>
>>106904915
i revere the immortal god of 1girl
>>
>>
>>106904932
dialup didn't have lights. Usually.
>>
>lights
i gen in the dark
>>
File: ComfyUI_00047_.png (1.48 MB, 1024x1024)
>>106904885
>>
File: ComfyUI_07364_.png (1.93 MB, 1152x1152)
>>106904818
Buy a used 3090. Without it you really can't into this hobby.
>>
>>106904974
Every time I see a casino I think "taxes are too low".
>>
>>106904818
:(
I unironically started selling 3d smut rendered with CPU when covid hit and worked my way up, the first gpu I bought was a 2070, then a 3080 ti and finally a 4090
>>
>>
File: ComfyUI_00048_.png (1.63 MB, 1024x1024)
>>106904972
animoo, just a copied prompt.
>>
>>106905006
I would def. fuck an older lady if she looked like that
>>
>>106904430
>>106904454
on base noob the effect is so much more pronounced, i have to lower it to something like 1.002
you should try it at ~0.999 or lower as well desu
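
For anyone wondering what the number actually does: a minimal sketch, assuming epsilon scaling means dividing the model's predicted noise by a constant factor before the sampler step (the way it is commonly described). `scale_epsilon` is a hypothetical helper for illustration, not a ComfyUI API, and the real node may differ in detail.

```python
# Sketch only: epsilon scaling as commonly described, i.e. dividing the
# model's predicted noise by a constant factor before the sampler step.
# `scale_epsilon` is a hypothetical helper, not a ComfyUI API.

def scale_epsilon(eps, factor=1.005):
    """Return the predicted noise divided by `factor`, element by element."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [e / factor for e in eps]


if __name__ == "__main__":
    predicted_noise = [0.50, -1.20, 0.03]  # made-up values for illustration
    # The factors mentioned in the thread: 1.005, 1.002 and ~0.999.
    for factor in (1.005, 1.002, 0.999):
        print(factor, scale_epsilon(predicted_noise, factor))
```

Under that assumption a factor above 1 shrinks the predicted noise slightly and a factor below 1 enlarges it, which is why 1.005 and 0.999 push the result in opposite directions.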
>>
>>
What's the best wan2.2 i2v workflow right now? Something that has upscale, video extend and interpolation is what I'm looking for.
>>
>>
is 12GB VRAM and 32GB RAM good enough for NSFW grok tier shit?
>>
>>106904818
I envy they can train any shit.
>>
>>106905079
ask grok
>>
File: 1388362043058.gif (2.39 MB, 149x198)
>>106905082
>>
File: ComfyUI_temp_mpreu_00026_.png (2.01 MB, 1320x1024)
>>
File: ComfyUI_07366_.png (2.19 MB, 1152x1152)
>>
>>106905079
with ram swapping and/or using lower quants it is doable. people gen videos with 8gb even
>>
>>106905092
Would
>>
File: ComfyUI_18967.png (2.32 MB, 1600x900)
>>106904734
>blip2
Holy 2020 Batman! While more manual to use offline, I prefer Gemma 3 27b. JoyCaption (beta 3?) can hallucinate pretty bad and has trouble spelling/following orders. It's best to use a more refined vision model so you can guide the captions the way you see fit via the System Prompt/Prompt.

This is just my experience captioning thousands of images across multiple datasets/revisions though, there's no real single way to do things.
>>
>>106905079
I have 8gigs of vram and 32gig of system ram and I can generate a 5 second 832x480 video in 3 minutes without all the extras like upscaling, teacache shit and vram off-loading.
>>
File: QwenImg_00024_.png (1.44 MB, 1152x1440)
>>106905006
I like your style friend
>>
>>106905116
film grain on this one is nice
>>
File: ComfyUI_temp_mpreu_00029_.png (2.02 MB, 1320x1024)
>>
File: QwenImg_00018_.png (1.7 MB, 1152x1440)
>>
File: ComfyUI_temp_mpreu_00030_.png (1.93 MB, 1320x1024)
>>
File: ComfyUI_06016_.png (1.14 MB, 832x1248)
>>
File: 1735610446418258.png (1.13 MB, 1360x768)
man, qwen edit makes games even more realistic. like pokemon in france.
>>
>>106905174
der plastiCoomer
>>
File: 1749406917675219.png (179 KB, 2006x898)
https://www.reddit.com/r/StableDiffusion/comments/1o7x4zr/in_anticipation_of_no_kings_day/
hmm sweety? we have to remind everyone that orange man bad on this local model subreddit m'ok?
>>
ok
>>
>>106905092
model + lora ?
>>
File: ComfyUI_07371_.png (1.83 MB, 1152x1152)
>>
File: ComfyUI_07381_.png (1.9 MB, 1152x1152)
>>
File: WanVid_00002.webm (462 KB, 720x960)
Congratulations the job is yours, just one more thing
>>
File: 1741733917228082.jpg (200 KB, 1024x1024)
They're gonna discover that "AI slop" phenomenon is just your brain reacting to new information

then we're all gonna look retarded as it becomes a signed record of how brittle and insecure we are
>>
>>106905306
wtf does that mean?
>>
>>106905306
then what about schizo slop
>>
>>106905306
Backwards.

ai slop means that something looks extremely average, somehow.
>>
>>106905315
reacting caustically to AI generated content is just rooted in not being able to process it fast and intelligent enough
>>
>>106905306
I think "AI Slop" is the phenomenon of recognising that something was artificially generated, and thus failing the test of mimicking realism
>>
>>106905321
false

AI generated content looks different to the brain, not familiar
>>
Wait what, they released the low version of the new 2.2 i2v moe distill as well?
>>
>>106905338
AI Slop is what it says. It's "slop", which is careless trough food for pigs, and "AI", as in it's made by a generative AI model. Basically it's low quality mass produced content and is in the same vein as Unity Asset Store game flips on Steam. It's something only an amateur without taste can produce, usually for cheap attention with zero effort.
>>
>>106905339
experiment:

give people AI images and "real" images and tell them to browse in a completely white room

after an hour, ask them how much time has passed

if the people who looked at AI images on average say more time has passed, their brain was processing more new information and we're fucked

if the people that looked at real images say more time has passed, then AI is hard to look at because it is too familiar

basically, it has been shown that children experience time slower because they're actually taking in more new information. So a 2 hour car drive to them is subjectively longer. They tested this on flies and some other animals
>>
File: AniStudio_output-00396.png (1.68 MB, 1024x1344)
why is someone compelled to post reddit screenshots on /g/?

anyways, almost done reworking the anistudio architecture. shared libs will probably work out better for the long run and just make things simple for registering custom crap to the managers. also makes it possible to have mismatching dep versions just work anyways so long as the base classes or api aren't dramatically changed.
>>
>>106905370
why you are compelled to post here is the real question
>>
File: 1888 Benz Car.png (298 KB, 550x412)
>>106905334
it's true though, AI is in the vast majority plastic slop, it's not because the bad image comes from a new machine that it means the bad image becomes a good image, the newness has nothing to do with the quality, it's even the opposite, usually a new product is completely shit, no one runs 1800s cars anymore because they fucking suck, yet those were the "new product" we're talking about
>>
>>106905378
discussion of Free and Open Source Text-to-Image/Video Models
>>
>>106905339
It looks highly averaged out.

for example, the majority of people have either the left or the right eye a bit too high/low.

ai tends to be average - perfect. It's kind of uncanny, a few people are like that, but it's unnerving.

Your hamburgers are another example. Those are the creepy "art" hamburger. Real hamburgers don't look like that.

Another slop thing is defying physics. like the cow. Cows are crazy heavy lol
>>
File: WanVid_00007.webm (951 KB, 960x720)
I dunno why he insists on coming here, there's an olive garden across the street
>>
File: ComfyUI_07385_.png (1.91 MB, 1152x1152)
>>106905306
In addition to what this anon says
>>106905338
AI slop is also simply not being able to gen smut when you ask it to, outright refusing requests, or asking for a woman and then it giving you a man.
>>
File: 1750253200100058.mp4 (2.06 MB, 720x720)
>>106905279
>>
File: 1748164734075104.png (288 KB, 2729x1270)
https://pytorch.org/blog/flexattention/
pytorch 2.9 got released, is this shit better than sageattention though?
>>
File: 1742253901483284.mp4 (301 KB, 608x480)
>>106905306
absolutely braindead take
>>
>didnt make it to the collage
shit thread
>>
File: ComfyUI_07390_.png (1.78 MB, 1152x1152)
>>106905429
Nice
>>
File: ComfyUI_00002_.mp4 (1.43 MB, 640x640)
>>
>>106905306
"AI Slop" means the kind of worthless gens that you get when you don't try hard enough, don't have good ideas, or aren't patient. It's the "easy" results from over-constrained inputs, it's lazy ideas like "donald trump hitting taylor swift with a hammer", it's all the ways that mediocre people make mediocre use of AI to generate boring worthless images that have no reason to exist. It's associated with overcooking, 'default styles', and other tell-tale signs that the image was made by AI, but it doesn't merely mean those things; rather, those things are signs pointing to and elements constituting the mediocrity which is the soul of the slop image.

Because the significant thing about AI image gen is that it's way too easy to make images which seem more interesting than they are, and without the filter of effort (remember when you had to make images in photoshop?) the internet is now awash in content that is only as interesting as the thought and spirit that went into making it, which is in most cases 'as little as possible'
>>
ok bros fuck it, I'm building the nunchaku wheel with qwen loras support myself, that'll teach the stupid [insert east asian slur] griftoid
>>
>>106905622
art is for fags anyway, im glad ai is flooding the sector. artfags get a reality check so they go do a real job
>>
>>106905662
slur means... what?
>>
>>106905306
How did you even come to this retarded conclusion?
>>
File: ComfyUI_00011_.png (881 KB, 1024x1024)
Euler... is a sin?
>>
>>106905670
It's cosmically likely that the next Hitler will be an artist who was pushed into other areas by AI art.
>>
File: ComfyUI_00013_.png (682 KB, 1024x1024)
>>106905686
>>
File: ComfyUI_07014_.png (3.1 MB, 2560x2560)
>>
can you gen some petite women instead of these monstrous cowtits hags? thanks
>>
>>106905290
absolute kino
>>
is there a gguf for chroma?
>>
>>106905820
Yes.
>>
>>106905822
what version you recommend or just get the latest?
>>
>>106904570
There is no ideal number of images. Depends on what a model already "knows", how complex a subject is, etc. Try with what you have, fail fast.
>>
>>106905079
I just realized grok has upped their filter now
>>
What's the most bloated model I can go get?
>>
>>106905854
hunyuan image 3
>>
File: wtf.png (94 KB, 880x872)
There's something seriously fucked with ComfyUI on my end. This is an illustrious model doing a simple tiled diffusion upscale @ 2x. The first set of peaks are during the Tiled VAE Encode, the second when KSampler first kicks in.
For that second set of peaks, it's not even genning, it just sits there bouncing up and down for like 10-15 seconds. Where it smooths out is when it finally starts to gen.
It does this with a lot of models and workflows, I've noticed. It'll, for no reason I can discern, max out my VRAM during some loading phase, then smooth out. Anyone else have this happen?
>>
>>106905866
Also, it doesn't do this with forge/reforge. Just Comfy.
>>
>>106905866
>>106905871
skill issue
>>
>>106905866
Update your pytorch and if that doesn't work, go scorched earth and reinstall from scratch
>>
File: wtf2.png (6 KB, 626x183)
>>106905866
Weird thing is, it mostly does it on first model load or something. After awhile, it smooths out completely, and transitions smoothly between each step in the workflow. This is the tiled vae encode and ksampler run after running the workflow a few times.
>>106905899
Worth a shot.
>>
File: ComfyUI_00017_.png (1.77 MB, 1024x1024)
>>106905760
>>
>>106905910
you might have some nodes pinning the VRAM memory, so it has to do a lot of 'flushing'
>>
>>106904838
this except it's also amd but 64gb system ram
sdxl gens are pretty fast especially considering it's the same or even less time to run in batches of 4 so it's acceptable.
Qwen Edit is annoying though at 5-15 minutes per.
>>
>>106904996
where do you find customers?
>>
File: ComfyUI_06876_.png (3.33 MB, 2560x2560)
>>
File: file.png (394 KB, 1546x858)
>>106905662
>it works
nunchaku sirs, we're home!
>>
>>106905983
The fuck haven't they merged that yet. Lazy bastards
>>
>>106905290
Any lora to get that effect?
>>
>>106905290
perfection
>>
I need some snake oils for chroma
>>
>>106906258
Fuck, I just downloaded Lumina knowing I have an eight gig card! What've'you done did?
>>
>>106905622
I don't know. I've seen AI content where the creator puts in way too much effort but you still get the droves of "ew ai slop" crowd while another account that spams generic dark fantasy slop over and over gets praise and comments saying "this is how ai should be used!!!" so I don't really trust the general consensus on this.
>>
File: denino.png (994 KB, 1360x768)
>>
>>106905866
I just realized mine does the same thing but I thought it was normal
Did you manage to fix it?
>>
>>106906512
I tried removing all custom nodes, changing pytorch version, etc, but in the end I needed to do a fresh install of Comfy. That fixed it completely. Not sure what it was, some broken dependency or something. My former Comfy install was pretty bloated though.
>>
>>106906441
is that ice poseidon?
>>
File: blimey.png (949 KB, 832x1248)
>>106906553
yea
>>
>>106906005
no, the guy doing nunchaku is actually a student and uni began again, he's not spent much time on nunchaku lately sadly.
>>
File: fun.webm (765 KB, 720x960)
>>
File: ComfyUI_03387_.png (1.58 MB, 1024x1024)
>>
>>106907336
cronger
>>
File: ComfyUI_03393_.png (1.5 MB, 1024x1024)
>>
File: ComfyUI_03403_.png (1.91 MB, 1024x1024)
>>
File: ComfyUI_03418_.png (1.32 MB, 1024x1024)
>>
File: ComfyUI_03421_.png (1.29 MB, 1024x1024)
>>
File: ComfyUI_03425_.png (1.87 MB, 1024x1024)
>>
>its time for my mega dump of blurry shit
>>
File: ComfyUI_03461_.png (1.49 MB, 1024x1024)
>>
File: ComfyUI_03487_.png (1.35 MB, 1024x1024)
Alright last "blurry shit" aka "a photo of"
>>
File: ComfyUI_02981_.png (1.74 MB, 912x1160)
>>106904218
Put /ldg/ in title next time.
>>106905079
Technically yes but video diffusion with 12gb is rough.
You want a stronger card with more VRAM, ideally 24+ and 64+ system ram.
t. 3060 owner with 32 gigs of system ram.
>>
>>106907645
man just look at her eyes here, dot it look like a 'photo of' instead of mangled pixels? get a reality check, retard
>>
>>106907645
These were Krea, right?
>>
File: file.png (66 KB, 1250x436)
>>106907670
>man just look at her eyes here
>amazing eyes for flux-based model at "background character" distance

>>106907675
>Krea
Chroma1-HD-Flash inference of LoRA trained on Chroma1-HD using CAME. After yesterday's conversation about optimizers, I was inspired to try it again. It validates waaaay better than AdamW with the same settings
>>
>>106907711
yeah dude AMAZING, each of your gens has the eyes as pixellated fuckage but keep coping retard.
FANTASTIC
>>
>>106907645
looks shit if i'm honest.
and if I'm brutally honest, you should stop and give up right now. you do not have the eyes or taste for this hobby.
>>
File: ComfyUI_03520_.png (2.19 MB, 1152x1152)
>nogen
>shitter
Who could have predicted this outcome?
>>
>>106907844
Post more
>>
File: ComfyUI_03531_.png (1.91 MB, 1152x1152)
>>106907878
Work time soon
>>
File: ComfyUI_03532_.png (1.93 MB, 1152x1152)
>>
dude, I am a #1 Chroma stan, but your gens are shit. Look at the clean elements in the images you posted, like the skies, there are plenty of artifacts in it, plus the women look blurry/smudgy.
>>
>>106907955
Show us your gens then?
>>
>>106907955
I am a Chroma hater (kinda) but even I know that Chroma shouldn't be this fucking grainy.
I am curious about his workflow and if he is using some heavy quant.
>>
>>106907979
I would unironically generate an image using one of your prompts, but right now I am in the middle of a lora training.
>>
>>106906549
>>106906512
>>106905866
probably because you used --fast without specific arguments, which enables all --fast optimizations
and since cumfart added a new optimization some weeks ago that was trash, it got activated for you

try just "--fast fp16_accumulation"
>>
When enslopification becomes the new normal, will normal look uncanny?
>>
>>106907645
it takes skill to make a workflow which turns a modern model into 1.5 overbaked lora gens, post your workflow so it can be fixed or stop posting this slop
>>
>>106905290
Awesome
>>
File: 1752502016755320.gif (66 KB, 112x112)
>>106904996
>>106905006
>>106905021
any catbox? hot
>>
File: ComfyUI_03524_.png (2.07 MB, 1152x1152)
Hey >>106908023, >>106907979 is another anon

Been in meetings anons.. Here's a catbox, disabled --fast, makes little difference with enabled

https://files.catbox.moe/dwy7g9.png

I'm not really here to impress people but thanks for caring guys
>>
I love how pedo ani waits until off hours to post now
I'm just going to post the logs during peak hours since you're too fucking stupid to take a hint. Last warning before I post the catbox of your bondage shota from your gooning discord where you ERP with other men.
>>
>>106908255
>post the catbox of your bondage shota
disgusting... you mind posting it already?
>>
>>106908255
Literally who
>>
>>106908255
lol you love ani
>>
File: ComfyUI_03558_.png (1.7 MB, 1024x1024)
Forcing the VAE to FP16 seems to produce fewer artifacts in the background. Are all the nogens looking on a phone? They do look bad/grainy on phone and fine at 100% zoom on desktop for me
>>
>>106905146
>>106905160
Have to remake Married with children lora for Chroma
>>
>>106908255
>bondage shota
Bro really thinks this is something serious.
>>
Fuck it I'll post it.
>>
>>106908323
why wasn't it this post?
>>
>>106908316
It is when you doxxed yourself and made yourself a pariah that desperately begs for support. I won't say anything more because only he needs to see this.
He behaved for a bit but probably hit the bottle again
To the faggot playing defense, it won't work I said what I needed to say. You can samefag on your phone but I gave my warning.
>>
Trani lost a ton of support after he got caught the first time LMAO
>>
I haven't touched stable diffusion in almost a year. Is NoobAI still the go-to model or did something better come along? Same with SwarmUI, I was using that instead of reForge.
>>
>space where everyone is afraid of pedos abusing software
>gets outed as a pedo
>again
You're a lolcow Julien
>>
why does ran save shota porn on her computer?
>>
>>106908386
noobai is still king. you can try neta yume lumina as an alternative (for animeme) but it has basically no loras
>>
File: image_00030_.jpg (2.5 MB, 1680x2160)
>>106907711
I use way lower LR with CAME. AdamW gives nice results with 0.0003, but with CAME I use 0.00008 for the same dataset.
>>
>>106908386
>Is NoobAI still the go-to model
sadly, yes
>>
>>106908409
i don't think he understands the type of people that regularly use 4chan desu
he just makes ani sound more and more based
>>
>space where everyone is afraid of pedos abusing software
lol?
>>
>>106908430
ani IS based. he's the only one here who is skilled enough to put together an exe
>>
>chroma
this has to be one of the few websites left in the world where someone can get multiple responses saying they are wrong and they can just keep posting right through it. schizos absolutely convinced they are genning masterpieces and everyone else is a troll lmao
>>
>>106908513
solution is to not engage
>>
>I'm going to expose ani with this evidence I completely made up
>I am going to post it I swear
>no really I am going to do it
>*leaves*
>>
File: ComfyUI_00695_.png (2.64 MB, 1360x1744)
>>
>>106908513
>schizos absolutely convinced they are genning masterpieces
thats not just on /g/ or about AI, you have these type of retards on literally every board, they spam some content they think is good but everyone else says its shit and annoying and he should fuck off and die.
I feel like I seen this a million times play out.
>>
>>106908386
Yes if you are making /a/, /h/, /co/, /aco/, /trash/, etc. content.
Use v-pred 1.0
>>106907711
>>106908420
I have only trained a few loras.
Is there a reason you guys don't use prodigy?
>>
>>106908588
>Is there a reason you guys don't use prodigy?
telemetry
>>
>>106908605
???
>>
>>106908605
then why do you use comfy?
>>
>>106908618
>then why do you use comfy?
Too dumb to script.
>>
>>106908588
>Is there a reason you guys don't use prodigy?
I get better results with AdamW and CAME. That's it really
>>
File: flux.jpg (717 KB, 1664x2432)
>>106907711
>amazing eyes for flux-based model at "background character" distance

this is literal flux, flux based isn't bad like that
>>
The harsh truth is that Flux Schnell and Flux Dev are better and generate better results than Chroma
>>
>>106908710
chroma looks like an ancestral sampler with too few steps. like it's underwater or something
>>
The harsh truth is that Chroma is better and generates better results than Flux Schnell and Flux Dev
>>
File: chroma_flux__0025.png (1.6 MB, 832x1216)
the only thing i really like about chroma is when you can use it like an MoE with a secondary model. Using Chroma for the first half of the gen can take advantage of the uncensored data set and get some creative looking people. then finish the gen with flux so it doesn't look like grainy ass. this is actually very slow though so debatable if it's even worth it.
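
A rough sketch of the step split described above, assuming the handoff is done by giving each model a contiguous range of sampler steps (for example via the start/end step inputs on advanced sampler nodes, which is an assumption about the setup and not taken from the post). `split_steps` is a hypothetical helper for illustration.

```python
# Sketch only: divide one sampling run between two models by step range.
# `split_steps` is a hypothetical helper; the first range would go to the
# first model (e.g. Chroma) and the second range to the finisher (e.g. Flux).

def split_steps(total_steps, first_fraction=0.5):
    """Return ((start, end), (start, end)) step ranges for the two models."""
    if total_steps < 2:
        raise ValueError("need at least 2 steps to split between two models")
    if not 0 < first_fraction < 1:
        raise ValueError("first_fraction must be between 0 and 1")
    switch = round(total_steps * first_fraction)
    # Clamp so that each model gets at least one step.
    switch = max(1, min(total_steps - 1, switch))
    return (0, switch), (switch, total_steps)


if __name__ == "__main__":
    first, second = split_steps(30, 0.5)
    print("first model steps:", first)
    print("second model steps:", second)
```

Both models still have to be loaded and run, which matches the "very slow" complaint.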
>>
File: chroma_flux__0020.png (1.45 MB, 832x1216)
it also produces some women that would actually sleep with me
>>
>>106908870
Even better, just train a Chroma LoRA using Flux. Since Flux is the best, just use Flux as the data to fix Chroma.
>>
>>106908870
nipple, delete
>>
>>106908880
yikes bro
>>
>Infinite Vid Wait Room Edition
>no infinite vid release

sigh..

>>106908870

chroma is fun, you can image 2 image creating even more goofy variations

>>106908892

the anon fears the nipple
>>
>>
Give it to me straight, we'll never get a local Udio will we?

Seems like all open source music gen models are leagues behind closed source.
>>
>>106909166
I think it's intentionally sabotaged because the music industry is extremely predatory over copyright
>>
>>106909166
there's songbloom dpo, but it's kinda meh. Interesting but ultimately MEH.
>>
>>106909179
China doesn't have the most basic protections for it though.
>>
File: chroma_flux__0052.png (1.44 MB, 832x1216)
>>106908978
chroma gens taste like mental illness, that's the fun part
>>
File: ComfyUI_03563_.png (1.59 MB, 1024x1024)
>>
>>106909166
What is the best option for local?
>>
>>106909183
>songbloom dpo

Tried it. Not enough control and very basic instruments, plus their license changed to closed source.

I want to be able to generate high quality Japanese songs like this
https://www.udio.com/songs/dFTtQHCqxbHLyArX4vx6QZ
YuE is kinda there but worse sound quality. ACE Step sounds too robotic. And I hear talks on their Discord of possible non copyright infringing dataset. Since it would need a decent copyrighted dataset, I'm guessing it's just never gonna happen.
>>
>>106904218
Fuck off! AI board now
>>
>>106909252
>AI board now
AI bored now.
>>
File: ComfyUI_03612_.png (1.17 MB, 768x1280)
>>
>>106909344
lit!
>>
File: file.png (2.23 MB, 1328x1328)
>>106909344
>>
>>106909408
>plasticmaxxed
>1:1 ratiomaxxed
>contrastboostmaxxed
>slopmaxxed
let me guess qwen image with no loras?
>>
File: file.png (2.45 MB, 1328x1328)
>>106909416
>sneething this much
>>
>>106909432
>cant gen a realistic image to save his life
>>
>>106909443
>shitposting
>nuh uh its not realistic
>for literal 8~ seconds gens
LMAO my dude, keep copering with chromer
>>
>>106909449
if a model cant gen realistic images its worse than chroma, sorry you feel so attached to your models you formed your identity around it
>>
>>106909466
>literally malding after 2 shitpost images on chroma
>somehow I'm the one caring and attached to any other slop model
LOL dude you're a goldmine, please keep 'em coming
>>
>>106909480
>if i accuse the opponent of malding i dont have to showcase a realistic non-slopped image of my model
bro is actually mindbroken by an ai image model existing online
>>
File: 1740179584824261.gif (3.89 MB, 200x200)
Man why the fuck is stable diffusion shit so complicated? Why does everything require you to run python or some shit? If AI is so great how come you can't just get everything working with a simple exe file? Pajeet ass tech
>>
>for literal 8~ seconds gens
imagine conceding to using speedup loras
>>
>>106909494
at least it isn't imagining shota porn and saying you will post it only to retreat to a padded room
>>
>>106909510
you seem to have replied to the wrong person
>>
>bro thinks shota porn is bad
Go back
>>
>>106909505
welcome to the anistudio waiting room >>106905370
>>
>>106909505
>simple exe file
if only it were so simple

>>106909523
schizo is having melties over someone finally making an exe? why?
>>
>>106909544
>schizo is having melties over someone finally making an exe? why?
extreme jealousy and african genetics
>>
>>106909520
kys faggot
we only enjoy loli here
>>
>>106905370
>why is someone compelled to post reddit screenshots on /g/?
lilbro does it ALL the time fuck
>>
>ywn plant your face in ani's sweaty cat girl loli butt
why even live?
>>
File: masked automagic.gif (1.43 MB, 210x314)
Random project of the day, Automagic with 3D Perlin Noise masking of layers.
>>
Didn't trani get caught posting images of little boys last year?
>>
>>106909690
Are you masking whole images? for what purpose?
>>
>>106904218
>Infinite Vid
The code is there. The model is there. Why isn't some nerd like kj adapting it?
>>
>>106909717
No, masking the layers of the transformer model so that different parts of the weights are activated over time as a regularization and plasticity training method. I'm curious if it would help with the lazy middle layer problem which is most apparent in Flux.
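
A rough sketch of what "different layers active over time" could look like, not the poster's actual code. Assumptions: it uses plain 2D value noise (lattice values blended with smoothstep) as a stand-in for real 3D Perlin noise, and every name and parameter here (`layer_mask`, `layer_scale`, `time_scale`, `keep_fraction`) is made up for illustration.

```python
import math
import random


def _lattice(seed, i, j):
    # Deterministic pseudo-random value in [0, 1) for an integer lattice point.
    return random.Random(f"{seed}:{i}:{j}").random()


def _smoothstep(t):
    # Ease curve so the noise changes smoothly between lattice points.
    return t * t * (3.0 - 2.0 * t)


def smooth_noise(seed, x, y):
    """2D value noise (NOT true Perlin gradient noise), output in [0, 1)."""
    x0, y0 = math.floor(x), math.floor(y)
    tx, ty = _smoothstep(x - x0), _smoothstep(y - y0)
    v00 = _lattice(seed, x0, y0)
    v10 = _lattice(seed, x0 + 1, y0)
    v01 = _lattice(seed, x0, y0 + 1)
    v11 = _lattice(seed, x0 + 1, y0 + 1)
    top = v00 + (v10 - v00) * tx
    bottom = v01 + (v11 - v01) * tx
    return top + (bottom - top) * ty


def layer_mask(num_layers, step, seed=0, layer_scale=0.35,
               time_scale=0.002, keep_fraction=0.5):
    """Return one bool per layer; True means the layer is trained at this step."""
    values = [
        smooth_noise(seed, layer * layer_scale, step * time_scale)
        for layer in range(num_layers)
    ]
    # Keep the layers with the highest noise values so a roughly fixed
    # fraction of the model stays active at any given step.
    n_keep = max(1, round(num_layers * keep_fraction))
    threshold = sorted(values, reverse=True)[n_keep - 1]
    return [v >= threshold for v in values]


if __name__ == "__main__":
    for step in (0, 1000, 2000, 3000):
        row = "".join("#" if on else "." for on in layer_mask(16, step))
        print(f"step {step:5d}  {row}")
```

In a real training loop the mask would presumably gate gradients or optimizer updates per transformer block. Because the noise is smooth along both axes, neighbouring layers tend to switch on and off together and the active set drifts slowly instead of flickering every step.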
>>
>>106909724
unless it will run 16+fps on a 3090 its worthless
>>
>>106909751
It doesn't require any more VRAM than the base Wan model. Just time.
>>
>>106909724
he's probably waiting for the wan 2.2 version
>>
>>106909738
That's pretty interesting, I thought it was only for training samples. I guess layer filtering isn't robust enough
>>
>>106909246
Kek, Udio is just too good.
https://www.udio.com/songs/3zWurqZBj5squfdGgGqut3
The model understands not just music but different accents, and the devs have figured out a way to make the composition nearly flawless (or at least from the way this guy prompts his songs).

Whoever made this is like BFL but for music.
>>
>ywn be a bound shota as ani's sweaty cat girl loli plants her butt on your face
why even live?
>>
>>106909701
Yes and he got dragged for it
>>
>>106909776
I don't think the optimizer side has been looked into nearly enough. Instead of throwing more parameters at the model, maybe it's about making the existing parameters more efficient. It might even help with the extreme data problem: in theory you should get more mileage from a smaller dataset with a method like this, since each neuron learns different things from the same set of images as they're activated and deactivated over time, which forces new learning approaches. It could also act like a self-pruning training method, where inefficient neurons that sit deactivated for a few thousand steps get repurposed for something more efficient once they're reactivated.
>>
>>106909166
There is zero interest from most companies to even train a model, let alone make it local.
Udio (and others) will probably remain the best for a while.
>>
>trani uses the same cope as above
>ended up with him being "soft banned" from the general
Why did you give up on the trani studio general?

No traction?
>>
*yawn*
>>
>>106909166
very little use case of audio gen versus hude pressure of jewish music industry handlers creates little incentive for anyone to go out of their way to train a new type of gen ai instead of just working on making improvements on the much more important text/image/video gen models

basic audio gen is already there, and everything else is just allowing regular people to make songs to post online and thats it, theres no grand use case for the general public or business that companies can satisfy and earn money from
>>
>>106909841
>hude
huge
>>
*nway*
>>
File: bakeaprettycake.png (2.73 MB, 2048x1024)
>tfw the lora samples getting cheeked up
>>
>>106909809
My technical understanding isn't sufficient for this, but I've been wondering why they are trying to make these new models good at everything instead of making smaller, more compact, specialized models. I also don't understand why something like text generation is being held as a benchmark; you could probably have a smaller ControlNet model take care of it completely.
>>
>>106909841
Yes, but on the same note neither Udio nor Suno should exist, and probably won't for long, because they would face the same scrutiny as open source devs, and yet they still do. Open source is the only way for this tech to survive. Once it's out there, there's nothing the music industry can do short of suing those who use it commercially (which would then be soon on its way to being unenforceable).

Coding models are disruptive. Image models are disruptive. So too can be music models.
>>
>>106909724
He probably doesnt even know it exists, no one is talking about it apart from these generals and maybe 1 thread on leddit, kek

>>106909775
2.1 loras work (mostly) fine on 2.2 :(
>>
>>106909893
A model that is good at everything is the trillion dollar model, but yes, smaller specialized models should be pursued more as they're easier to train and require far fewer parameters.

>why text to image
Because that's how we communicate, in the professional world that's how you get an artist to make something you want, you say words and maybe show a picture or two. Which in a way is how Kontext and QIE work.
>>
File: OOM.png (16 KB, 505x197)
>"thank god AMD has finally caught up and my 7900XTX investment wasn't all for nothi-ACK!"
I should write a script that automatically replaces all of these with the tiled version, fucking bluescreen landmines.
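
Something along these lines might do it. A minimal sketch, assuming the node in the screenshot is the plain VAE decode, that the workflow was exported in ComfyUI's API format (a dict of node id to `class_type` and `inputs`), and that `tile_size` and `overlap` are the extra inputs the tiled node wants. The default values below are guesses, so check them against your ComfyUI version.

```python
import copy

# Assumed extra inputs for the tiled node; verify against your install.
TILED_DEFAULTS = {"tile_size": 512, "overlap": 64}


def use_tiled_vae_decode(workflow, extra_inputs=None):
    """Return (patched_copy, changed_ids) with VAEDecode -> VAEDecodeTiled."""
    if extra_inputs is None:
        extra_inputs = TILED_DEFAULTS
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    changed = []
    for node_id, node in patched.items():
        if isinstance(node, dict) and node.get("class_type") == "VAEDecode":
            node["class_type"] = "VAEDecodeTiled"
            inputs = node.setdefault("inputs", {})
            # Only fill in what is missing; existing links stay as they are.
            for key, value in extra_inputs.items():
                inputs.setdefault(key, value)
            changed.append(node_id)
    return patched, changed


if __name__ == "__main__":
    demo = {
        "3": {"class_type": "KSampler", "inputs": {}},
        "8": {"class_type": "VAEDecode",
              "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
    }
    patched, changed = use_tiled_vae_decode(demo)
    print(changed, patched["8"])
```

To run it over a folder of exported workflows, load each file with `json.load`, pass it through `use_tiled_vae_decode`, and write the result back with `json.dump`. It will not touch workflows saved in the UI graph format, which stores nodes differently.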
>>
File: 1422449559229.jpg (16 KB, 330x344)
>7900XTX investment
>>
File: ComfyUI_temp_mpreu_00025_.png (1.94 MB, 1320x1024)
>>
>>106909976
Funny thing, I'm OOMing on system RAM at this node in a wan t2i workflow after any lora switch, but i2v works just fine. Linux shit just freezes for ten minutes before the oomkiller does its work.
>>
>>106910085
pls respond anon, which base model + lora is that?
>>
need a buttchin lora
>>
File: ComfyUI_06057_.png (1.27 MB, 960x1080)
>>
>>106909976
shit like this is why comfy is terrible design. they just keep piling on redundant nodes that bloats the selection and if they did get rid of it it would break almost all workflows. fucking retarded
>>
File: 69420_00002_.png (423 KB, 512x640)
>>
File: 1747144509978409s.jpg (5 KB, 184x250)
>>106910088
>the evil digital spirit known as OOMRAM strikes again!
>LOCALS urged to purchase more RAM to appease the latent beast
>>
File: 00040-2585062173.png (1.93 MB, 1248x1824)
>>
>>106910088
Have you tried using a Clean VRAM node before it?
>>
>>106910114
this man beefing with open source
>>
>>106910127
open source or not, retardation is retarded
>>
>>106910137
noooo everyone stop contributing to the open source project in a way i don't like, noooooooo
>>
>>106910094
its chroma and no lora anon, just prompting
>>
File: 00047-2446453550.png (2.17 MB, 1824x1248)
>>
File: ComfyUI_06087_.png (865 KB, 1048x992)
>>
>>106910199
>its chroma
Post proof in the form of a catbox
There is no way Chroma can do that on its own
>>
File: image_00095_.jpg (860 KB, 1264x1712)
>>106910199
looks like old tv-show prompt from 1.5 days
>>
>>106910217
>There is no way Chroma can do that on its own
lol
>>
>>106910231
I say this as someone who regularly uses Chroma and trained many loras for it

It might also be a Saasfag trolling
>>
>>106910252
Why not? It's just a question of good enough prompt and seed gacha.
>>
>>106910252
I'm afraid you suffer from skill issue
>>
>>106910275
Very nice cleavage.
This could have been a game ad in the 90s.
>>
>>106910252
c'mon now
>>
File: 00020-3350621362.png (2.5 MB, 1536x1536)
>>
File: 00053-2630641514.png (2.3 MB, 1248x1824)
>>
>>106910275
>still no catbox
fuck off from /ldg/ (LOCAL diffusion general)
but the pic you just posted seems doable on Chroma though

>>106910273
I never got base Chroma to produce anything remotely close to this:
>>106905092
And I used to prompt a lot "1990s sitcom still" stuff, most of the time I used to get low quality garbage. Adding negative prompts would just ruin the aesthetic and produce slop.
>>
>>106910165
>Used RTX 3090 = Rp 8.500.000 (~520 USD)
>Used RTX 4090 = Rp 22.000.000 (~1350 USD)
>HOW THE FUCK ????????? Buying two RTX 3090 is still cheaper and you get twice the VRAM.... Is it possible to use 2 GPUs simultaneously to generate vids ?
nta but can someone maybe answer this one? "simultaneously" obviously as in both GPUs contributing to same inference task, as opposed to running 2 parallel tasks at the same time
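
For reference, the arithmetic behind the quoted prices, as a quick sketch. The USD figures are the approximate ones from the quote; the 24 GB per card is the stock VRAM of both the 3090 and the 4090. The option labels and function name are made up here. Whether the two cards' VRAM can actually be pooled for a single gen is a separate question this does not answer.

```python
def compare(options):
    """Add total price, total VRAM and USD per GB for each purchase option."""
    result = {}
    for name, (unit_price_usd, vram_gb, count) in options.items():
        total_price = unit_price_usd * count
        total_vram = vram_gb * count
        result[name] = {
            "total_usd": total_price,
            "total_vram_gb": total_vram,
            "usd_per_gb": total_price / total_vram,
        }
    return result


if __name__ == "__main__":
    # (unit price in USD, VRAM per card in GB, number of cards)
    options = {
        "2x used RTX 3090": (520, 24, 2),
        "1x used RTX 4090": (1350, 24, 1),
    }
    for name, row in compare(options).items():
        print(f"{name}: ${row['total_usd']} for {row['total_vram_gb']} GB "
              f"(${row['usd_per_gb']:.2f}/GB)")
```

On raw price per GB the pair of 3090s comes out well ahead, which is the quoted anon's point. Speed per card is not captured here.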
>>
File: image_00101_.jpg (729 KB, 1264x1712)
>>
Is GeForce RTX 5060 Ti 16GB in vramlet territory?
>>
>>106904845
>>106904839
>>106904636
>>106904562
these look like SD1.5 or something.. why are they such shit quality?
>>
Using Chroma I always get plasticky pictures using the official comfy workflow. What gives? What is the secret to proompting this model correctly?
>>
this whole thread is filled with just trash quality.. why do the majority of these posts look like they're jpegs from the 90s?
>>
>>106910403
you cannot use two gpus for one task. You can offload models to the second gpu but that is (mostly) pointless for video gen. Also, at this point a 4090 is 70-80% faster than a 3090 and well worth the price. Given all the blackwell features you could argue a 16GB 5080 is a better buy if a used 4090 is actually $1350.
>>
wheres the goddamn baker?!
>>
File: file.png (1.51 MB, 1024x1024)
Flux
>>
>>106910124
Yes. Also Intel Arc GPU.
>Attempting to free GPU VRAM and system RAM...
>CUDA is not available. No GPU VRAM to free.
>Garbage collector: collected 560 objects.
>System RAM: Initial usage: 66.30%, Final usage: 66.30%, Freed: 0.00%
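
That log suggests the node only checks CUDA. A hedged sketch of a cleanup helper that also tries Intel's XPU backend, assuming a PyTorch build that ships `torch.xpu`; this is not the node's real code, and `free_memory` is a made-up name. If torch is not installed it just runs the garbage collector.

```python
import gc


def free_memory():
    """Run the GC and empty the cache of whichever GPU backends are present."""
    report = {"gc_collected": gc.collect(), "backends": []}
    try:
        import torch
    except ImportError:
        return report  # no torch, nothing else to free

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        report["backends"].append("cuda")

    # torch.xpu only exists on builds with Intel GPU support, so guard it.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and hasattr(xpu, "empty_cache") and xpu.is_available():
        xpu.empty_cache()
        report["backends"].append("xpu")

    return report


if __name__ == "__main__":
    print(free_memory())
```

Emptying the cache only returns memory the allocator is holding but not using. It will not help if the model itself does not fit, and it does nothing for system RAM beyond what the garbage collector frees.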
>>
File: WanVid_00037.webm (731 KB, 960x720)
close enough
>>
>>106910546
He's taking his time making the perfect collage. He'll post the link just moments before this thread gets archived
>>
File: file.png (1.72 MB, 1024x1024)
>>106910555
Same prompt with Chroma. It looks...undercooked? IDK
Some help here? Using Euler with CFG 4
>>
>>106910484
From the sounds of their posts, multiple 2MP passes per image at 60+ steps each. For being based on the smaller Schnell, Chroma is brutally slow for some reason.
>>
>>106910598
you got conned into downloading garbage, sorry broski
>>
File: file.png (1.8 MB, 1024x1024)
>>106910598
This post was 26 steps. Pic related is 60 steps.
>>106910599

Man I thought Chroma was uncensored Flux, seeing what people post around here. But I can't get it to produce anything worthwhile. I'm sure I'm doing something wrong.
>>106910630
I got pretty good results with earlier checkpoints, so I don't believe that. You won't have me passing on kino no sir.
>>
>>106910598
maybe look at a basic workflow from civitai and add some negatives
>>
File: 00058-3443781915.png (2.78 MB, 1824x1248)
>>
File: 1713319706206.jpg (88 KB, 750x1000)
>>106910559
I can maybe understand boughting a AMD graphics but... intel?
>>
>>106910598
post your prompt
>>
>>106910651
a photo taken with a teleobjective of a beach. there is a low metal structure prominently taking most of the shot next to the shore. in front of this 1 story building, there is a blue pickup truck parked. two men are standing wearing overalls and hard hats, and a large wooden crate is between them. in the background, a large rock spire can be seen in the background, in the water. on one side, stairs go up a cliff, and a few buildings can be seen in the distance. the sky is blue, and the sand is pale yellow
>>
>>106910662
It's just some random prompt I made reproducing a location from YU-NO. But suddenly recreating classic VNs with AI-generated images seems like a perverse goal I should pursue...
If I get Chroma to generate what I need.
>>
File: ComfyUI_temp_tvodp_00001_.png (2.04 MB, 1024x1024)
>>106910662
Are you using some memeversion of chroma? Euler/simple 30 steps.
>>
>>106910650
I had AMD card before, so I was not scared.
>>
File: ComfyUI_temp_tvodp_00002_.png (1.92 MB, 1024x1024)
>>106910693
>>106910662
res multistep/bong
>>
File: ComfyUI_00001_.png (165 KB, 512x640)
>>
File: file.png (385 KB, 2026x866)
>>106910693
Can you post your workflow?
I'm using the workflow from here: https://huggingface.co/lodestones/Chroma1-HD

But the model is Chroma, not HD.
>>
>>106910403
No, because it's literally a synchronous process that basically requires the entire model to be loaded into VRAM to work. Each step is dependent on the previous one, and it's extremely difficult to split the math of a single step across two GPUs. The only people motivated to do this are poorfags (read: stupid).
>>
where be the goddamn BREAD
>>
>>106910712
Why are you running empty neg prompt retard.
https://files.catbox.moe/xmjegf.png
>>
File: file.png (39 KB, 1921x1209)
>>106910738
Thank you.
>>
File: file.png (1.74 MB, 1024x1024)
>>106910738
It's barely better with your neg prompt, and your monster of a workflow is not something I want to deal with.
I've been blackpilled on chroma today. Flux + inpainting with other models is the only way forwards for me.
>>
few mins
>>
File: 00071-4123617595.png (2.32 MB, 1248x1824)
>>
>>106910887
>>106910887
>>
>>106906985
Kek
>>
>>106910454
yea mostly but you can make it work with 64gb system ram pretty ok
sdxl will be no problem but larger models like qwen will be using system ram


