/g/ - Technology


Thread archived.
You cannot reply anymore.




File: 1739829461354990.jpg (1.46 MB, 2521x2558)
Discussion of Free and Open Source Diffusion Models

Prev: >>108033820

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: ComfyUI_temp_epnuh_00025_.jpg (328 KB, 1824x1248)
https://litter.catbox.moe/epy5sxdo84fxt3eo.png
>>
Blessed thread of frenship
>>
File: ComfyUI_temp_epnuh_00033_.jpg (456 KB, 1824x1248)
blessed thread of frenship
https://litter.catbox.moe/f0f2d7bivwmbhqqj.png
>>
File: Flux2-Klein_00491_.png (373 KB, 544x544)
>>
I claim this thread for NUCLEAR WARFARE
>>
>>108035720
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
you never learn schizo. you never learn
>>
>>108035751
kill ani
>>
>>108035747
why is the white man so carbrained?
>>
File: ComfyUI_temp_epnuh_00036_.jpg (367 KB, 1824x1248)
https://litter.catbox.moe/4k1w55omnsbjblex.png
>>108035758
the masculine urge to go vroom vroom is stronger in them i guess
>>
what nationality is tdrussell i need to know
>>
>>108035758
never seen Tokyo Drift?
>>
>>108035751
>>Maintain Thread Quality
I wish people would do this more often and not sperg out.
like why cant ldg be comfy?
>>
File: ComfyUI_temp_epnuh_00042_.jpg (508 KB, 1824x1248)
>>108035764
thanks fren
https://litter.catbox.moe/n4g8qhtkxzvdkugq.png
>>
File: ComfyUI_temp_epnuh_00058_.jpg (551 KB, 1824x1248)
>>108035775
https://litter.catbox.moe/1gzki70nfk94qdh1.png
>>
>>108035767
i assume generic white guy prolly given the name. Definitely from US or Canada, doesn't communicate in English like a European.
>>
>>108035796
>burgers finally wonned
nice
>>
https://litter.catbox.moe/aspirdm5a3dgv3fw.png
>>
https://litter.catbox.moe/o8931ydsy7e5ayvb.png
>>
https://litter.catbox.moe/ffg7yod9pm0qujzq.png
>>
>>108035796
just like astralite heart. not a good look
>>
https://litter.catbox.moe/3dno9bvdahb5tfz8.png
>>
https://litter.catbox.moe/4dxzcollbl9i0ejw.png
>>
>>108035704
Hell yeah.
>>
litter went down just now, awesome :)
https://files.catbox.moe/mypsso.png
>>
how do I unslop ltx2?
>>
>>
>>108035882
we gave up on it because they trained with suno and udio gens
>>
>>108035827
>noooo he has to be Chinese and inflict dogshit machine translated bullshit captions and stupid cultural biases upon all users of the model
>>
>>108035877
Use a base image from another model and use an I2V workflow
https://files.catbox.moe/niyz6j.png
>>
>>108035886
I don't think you understand. I don't care if it's a suno/udio distill.

LOCAL
O
C
A
L
>>
>>108035900
garbage in, garbage out
>>
>>108035886
((((((((we))))))))
>>
https://files.catbox.moe/ox11yr.png
>>
>>108035882
he means 2nd
>>
>>108035846
nice
>>
>>108035887
>it's my first time
>pay me
>>
https://files.catbox.moe/e86bay.png
>>
>>108035751
There's a thread without it. Perhaps /sdg/ is more your speed?
>>
>>108035915
like I said in previous thread this makes no sense unless you're admitting to being a corporate SAASjeet inference service yourself
>>
>>108035911
kino
>>
>>108035882
>two more days
>>
>>108035914
thanks fren :)
https://files.catbox.moe/poms68.png
>>
>>108035934
yeah its kinda weird honestly. is there like some kind of SaaS company focusing on anime models that has ties like to 4chan? like started here or something?
>>
>>108035934
>fucked nlp encoder
>nooooo you are a jeet!!!
>>
>>108035936
kurosawa is the GOAT
https://files.catbox.moe/jj3bjh.png
>>
Lol https://civitai.com/models/2354972/nayelina-z-anime
Here come the snake oils, here comes the early access to failed models, here comes the imbalance of Z model due to the fried dataset from slopers, here comes the overfitting on cowboy shots, here comes the shiny skin and hair.
Another garbage checkpoint from someone who doesn't understand dataset curation or training. Let's keep pumping out half-baked models, all hype and FOMO
>>
https://files.catbox.moe/emhn7s.png
>>
>>108035758
Non-whites wouldn't understand.
>>
>>108035895
that's exactly what I did but the rest of the video gets slopped
>>
https://files.catbox.moe/n5zfhg.png
>>
File: 1754425686271983.jpg (820 KB, 1184x1776)
>>
File: ComfyUI_09449.png (3.18 MB, 1440x2160)
>>108035494
Gay.
>>
>>108035906
Yet men are born of women.
>>
>>108035957
>lora trained on 1000 images (rofl) on a single 5090
>don't worry guys the final version will be a million images (still rofl)
>putting "Z-Anime" in the model name
>posting it everywhere and shilling
>there's a fucking website for the model
>you can sign up and make an account
>very obviously positioning it for some kind of API / SaaS thing
now THIS is what a grift looks like
>>
File: ComfyUI_temp_jflnc_00035_.jpg (1.08 MB, 1664x2432)
>>108035974
interesting. I'm not nearly as familiar with video genning, so I can't help out that much
https://files.catbox.moe/4h6qr4.png
>>
File: ComfyUI_temp_tkpxt_00002_.jpg (746 KB, 2880x1632)
>>108035992
if you're not griftmaxxing in 2026 you're ngmi
https://files.catbox.moe/q97i10.png
>>
>>108035998
in my testing wan is still superior in quality. I hope ltx2 can catch up
>>
Is the Comfy server borked for anyone else or did the jews finally remote-delete my system32 folder
>>
File: ComfyUI_temp_tkpxt_00013_.jpg (354 KB, 1824x1248)
>>108036003
I've heard that using new workflows and files can help, since ltx-2's initial release and comfyui workflow were misconfigured
https://files.catbox.moe/tqdvqy.png
>>
>>108035957
Shit image demo. Do these people even generate anime once in their life?
>>
>>108036008
Works On My Machine(TM)
>>108036003
Try using this one:
https://files.catbox.moe/7r8isi.mp4
>>
File: Comparison.jpg (2.69 MB, 2592x1024)
Klein 4B Distilled versus Klein 9B Distilled on in-place upscaling the other anon's Bateman pic from 544x544 -> 1024x1024

```Significantly improve the overall quality and detail level of photograph image 1 while keeping the original composition and layout and color palette and lighting and visual aesthetic and cross-eyed red eyes facial likeness exactly the same as it is.```
>>
>>108035946
what the fuck do you mean by "fucked NLP encoder"
>>
I'm late, but is this legit? The FP32 model for base was leaked and it's actually 24GB

https://huggingface.co/Hellrunner/z_image_fp32/tree/main

https://www.reddit.com/r/comfyui/comments/1qt88kg/z_image_base_teacher_model_fp32_leaked/
>>
File: ComfyUI_temp_ljghs_00040_.jpg (1.08 MB, 2880x1632)
https://files.catbox.moe/fhj26m.png
>>
>>108035957
we know dude
>>
>>108036036
it's fucked like the MLP ponyfucker's encoder
>>
>>108036002
>>108036016
>>108036043
what's your prompting method? wildcards to an llm separate then copy+paste ?
>>
>>108036045
???? literally nobody knows what the fuck you mean, it uses Qwen 3 0.6B and has good adherence to tags and captions equally, explain or GTFO lol
>>
>leaked
>>
>>108036041
isn't z image base fp16 already a teacher model? what am I missing here?
>>
Don't get the complaints about models sucking. Just fine-tune things into one you like. Or combine models in Easydiffusion or what have you. For example, I like dark skinned girls but default models make them too light for my tastes so I refined it myself.
>>
>>108035957
This faggot must have the entire dataset tagged in 2023 WD tagger eva large or Slop Caption which omits all the details.
>>
>>108036063
they're saying Tongyi-MAI released bf16 version by mistake, which is why it was 12gb.

if you check the repo, you can see it's 24gb now.
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/transformer
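If the 12gb and 24gb figures are right, the doubling is what you'd expect from bytes per parameter alone: bf16 stores 2 bytes per weight, fp32 stores 4. A rough sketch of that arithmetic, assuming the file is almost entirely weights and using decimal gigabytes (the parameter count here is only what the thread's numbers imply, not an official figure):

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2}


def implied_params(size_gb: float, dtype: str) -> float:
    """Parameter count implied by a checkpoint size, ignoring metadata overhead."""
    return size_gb * 1e9 / BYTES_PER_PARAM[dtype]


def checkpoint_gb(n_params: float, dtype: str) -> float:
    """Approximate checkpoint size in decimal GB for a given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# 12 GB quoted in the thread for the bf16 upload (assumption: weights only).
n_params = implied_params(12, "bf16")
fp32_size = checkpoint_gb(n_params, "fp32")
print(f"{n_params / 1e9:.1f}B params, about {fp32_size:.0f} GB in fp32")
```

So a jump from 12gb to 24gb on its own only tells you the precision changed, not that the weights are a different model.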
>>
File: 51132155454.png (77 KB, 1500x754)
>>108035886
>>108035882
I don't agree with any of these guys, it's at least v4.5 level, but Alibaba needs to hurry up and cook so these guys can shut up about it not being Suno X tier.
>>
>>108036066
tdrustled here, I did the same
>>
>>108036080
oh wait disregard that. I was looking at the turbo repo. I guess it's still the same for the z-image repo.
>>
>>108036064
Did you refine an entire base model instead of changing tags in the prompt? This is another level of skill issue, dude.
>>
>>
>>108036066
it's really not interesting, it's just some generic CivitAIer unnecessarily running a 1000 image lora sized dataset as a full finetune
>>
File: Flux2-Klein_00496_.png (367 KB, 544x544)
>>108036034
kinda prefer 4B? unsure.

Try this
>clean digital file, remove blur and noise, histogram equalization, unsharp mask, color grade, white balance correction, micro-contrast, lens distortion correction.
>>
File: bitmap.png (2.73 MB, 2732x1536)
>>108036052
I ask chatgpt to generate prompt ideas for me; I've a custom GPT for generating image prompts; i either upload a reference pic or go "okay here's my idea, write a prompt for it in this style" since right now the meta is writing multi-paragraph prompts in natural language, the "war and peace" approach
>>
>>108036132
Eyes reminded me of Rodney Dangerfield for whatever reason
>>
z-image optimization when
>>
>>108036125
500k images should be the absolute minimum for finetunes. I have person loras that are 2k images. There is not a chance in hell it has enough anime knowledge from 1k images. A single popular artist has more images than that. Waste of time.
>>
>>108036152
Stop using a 1070 to gen.
>>
>>108036158
what else
>>
File: z-image_00237_.png (1.4 MB, 944x1280)
>>108036152
>>108036158
how long you dudes take to generate an image?
I use a 4090 and for like 50 steps deis it takes 1 minute and 14 seconds
>>
File: 588800128231875.png (930 KB, 832x1248)
>>
>>108036173
5x
>>
who do I prompt these tits?
>>>/gif/30197486
>>
File: vds4.png (3.14 MB, 2432x1664)
>>
now testing anima. it knows pepe, neta doesn't. some anatomy problems with hands, it really likes adding text and multiple views, but I haven't worked on the negs yet.

reminder you can use
>%UNETLoader.unet_name%
as your save file path
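For anyone wondering what that placeholder does: the save node swaps `%NodeName.widget_name%` for the current value of that widget, so outputs get sorted by whichever model file was loaded. The sketch below only illustrates the idea and is not ComfyUI's own code; `expand_prefix`, the `widgets` dict and the model filename are made up for the example.

```python
import re

# Matches placeholders of the form %NodeName.widget_name%.
PLACEHOLDER = re.compile(r"%([^%.]+)\.([^%]+)%")


def expand_prefix(prefix: str, widgets: dict) -> str:
    """Replace %Node.widget% placeholders with values from a nested dict.

    Unknown placeholders are left untouched so typos stay visible.
    """
    def lookup(match: re.Match) -> str:
        node, widget = match.group(1), match.group(2)
        try:
            return str(widgets[node][widget])
        except KeyError:
            return match.group(0)

    return PLACEHOLDER.sub(lookup, prefix)


# Hypothetical widget state; the filename is an assumption for illustration.
widgets = {"UNETLoader": {"unet_name": "anima_preview.safetensors"}}
print(expand_prefix("%UNETLoader.unet_name%/img", widgets))
```

Handy when you're A/B testing checkpoints and don't want to rename the output folder every time you switch models.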
>>
File: ComfyUI_temp_ljghs_00056_.png (3.6 MB, 1824x1248)
>>108036173
I use a 5090 and it takes about a minute to generate a 1216x832 pic, 2m10s to gen a 1920x1080 pic
https://files.catbox.moe/qx6weh.png
>>
File: 1746444050734059.png (3.06 MB, 1184x1776)
>>
File: 1761882611680898.png (479 KB, 763x1144)
>>108035877
so far, the only thing that works for me is making a wan video and using it as guidance frames for ltx.

If the subject/s are too distant, it will get sloppy and start producing ltx face. Same if the subject/s move too much, twirling and dancing. Close-up portraits are better. The IC detailer is kinda shit but the static camera lora helps. Either way, loras make a big difference in quality. Experiment with enabling and disabling them, some loras do fine at 0.3 but will kill the output at 0.8.

More frames per second, higher resolution helps. If you have a 5090 or better, you can skip the upscaler and just gen at higher resolutions. Using more steps is hit or miss but it can sometimes help. I haven't experimented much but lcm at cfg 1.0 works OK. Though I'm probably going to attempt higher cfg with negatives to see if that improves things. LTX2 has a lot of problems but if you have the patience and time to tweak it for a particular i2v, it's usable
>>
>>108036144
ah interesting, thanks. what system prompt do you use on that agent?
>>
>>108036173
not who you replied to, but I'll jump in. 50 steps on my 4070 ti super is roughly xxminxxsec. I'm using Res MultiStep too.
>>
>>108036226
"You specialize in crafting high-quality prompts for image generation tools. You take user-provided descriptions or reference images and transforms them into structured, detailed prompts optimized for AI image generators. Include information about lighting. Include information about camera angle. Do NOT mention the image's resolution. Include information on the image's composition style, such as leading lines, rule of thirds, or symmetry. Specify the depth of field and whether the background is in focus or blurred. If applicable, mention the likely use of artificial or natural lighting sources. Do NOT use any ambiguous language. Include whether the image is sfw, suggestive, or nsfw. If it is a work of art, do not include the artist's name or the title of the work. Do NOT use polite euphemisms—lean into blunt, casual phrasing. Include information about the ages of any people/characters when applicable. Mention whether the image depicts an extreme close-up, close-up, medium close-up, medium shot, cowboy shot, medium wide shot, wide shot, or extreme wide shot. Explicitly specify the vantage height (eye-level, low-angle worm’s-eye, bird’s-eye, drone, rooftop, etc.). Your response will be used by a text-to-image model, so avoid useless meta phrases like “This image shows…”, "You are looking at...", etc."
>>
>>108036248
forgot to fill out my times :(
1 min 30 seconds. I stick to 30 steps mostly.
>>
File: 00120-1998280877.jpg (1.71 MB, 2048x2688)
That new Z model seems better
>>
>>108036256
She looks BURNED.

I'm WHITE.
>>
>>108036220
how do you gen long video with wan + ltx2? wouldn't wan make weird motion between 5 sec clips?
>>
>>108036220
this looks very comfortable
>>
>>108036270
svi lets you string four 5-second vids together. There's a workflow somewhere on civitai
>>
>>108036287
then why do you need ltx2?
>>
File: 1747667293108336.gif (38 KB, 638x640)
>make a bunch of wan2.2 gens
>looks weird
>didn't realise I had one of the low model loras set to half strength
FUCKING TWO MODEL HIGH-LOW BULLSHIT
>>
>>108036256
better at anatomy, with much increased gen times as a cost. Use it if you need it for professional productions and don't have the time/patience for post-production
>>
File: ga6.png (2.57 MB, 960x1312)
>>
>>108036302
to add audio and interpolate
>>
wow so original
>>
>>108036315
I see.
wan can't lip sync
>>
What's everyone's params for z-image base?
>>
>>108036173
My current workflow takes 15 minutes with a 3090 and 64gb of ram. I usually just queue up a bunch of prompts overnight
>>
>>108036327
My catbox links have the img with my workflow attached; basically it's a basic zib + hi-res fix pass wf
>>
>>108036365
>15 minutes
wtf are you talking about
>>
>>108036311
this nigger got triggered lmao
>>
>>108036374
It's not t2i, it's i2i with a lot of input images using slow stuff like multistep samplers, clownshark guides, and multiple conditioning
>>
>>108036383
output example?
>>
File: Flux2-Klein_00511_.png (382 KB, 544x544)
>>108036383
show us, get yer nodes out for the lads
>>
>>108036365
you need 15 minutes for 1 image with a 3090?
>>
SongMuLA claims to be genning. We'll see if it successfully vae's though.
>>
>>108036173
I'm using that newer model and it takes me : 5 min. 47.3 sec for 100 steps with a high res pass at 2x
>>
>>108036401
You should only be comparing iterations / second:
(assuming no speedups like torch)
>model & version
>changed prompt?
>cfg/nag
>sampler
>obviously the image size
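A minimal sketch of that comparison, turning "steps in wall-clock time" reports into iterations per second. The two sample entries use times quoted earlier in this thread (50 steps in 1 min 14 s, 50 steps in 1 min 30 s); the class and field names are made up, and the numbers are only comparable if model, resolution, sampler and cfg actually match, which the posts don't confirm.

```python
from dataclasses import dataclass


@dataclass
class GenReport:
    label: str      # free-form description, e.g. the GPU
    steps: int      # sampler steps actually run
    seconds: float  # wall-clock sampling time

    @property
    def its_per_second(self) -> float:
        return self.steps / self.seconds

    @property
    def seconds_per_it(self) -> float:
        return self.seconds / self.steps


def rank(reports):
    """Fastest first, by iterations per second."""
    return sorted(reports, key=lambda r: r.its_per_second, reverse=True)


reports = [
    GenReport("4070 Ti Super", steps=50, seconds=90.0),
    GenReport("4090", steps=50, seconds=74.0),
]

for r in rank(reports):
    print(f"{r.label}: {r.its_per_second:.2f} it/s ({r.seconds_per_it:.2f} s/it)")
```

Total time for a workflow with a hi-res pass or detailer mixes several sampling runs at different sizes, so it's better to report each pass separately than to divide the whole thing by the step count.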
>>
>>108036383
buddy these gens better be kino and nectar from heaven with that gen time
>>
>>108036427
Flux.2 is big, could be that.
>>
>>108036206
its funny that this pic is lewd to pedophiles because of the black and white / noir elements but otherwise it would be completely innocuous to see the exact same thing IRL since it would be in color
>>
File: f3r.png (2.44 MB, 2048x2048)
>>108036381
seek meds and bee bee sea treatment at once!
>>108036391
neat
>>
>>108036427
deep down you know it's 1girl
>>
>>108036436
I have used unquantized flux2 with a 3090. it does not take that long
>>
RACISM IS GENERATING

*cackles*
oh yes racism is soon
>>
>>108036391
>>108036394
>>108036401
>>108036427
The gens are nothing special, they just use many conditioning images from my custom 3D renderer, I'm sorry I'm not gonna share images. They need to be perfect and match each other because they are used as keyframes to an animation workflow
>>
>>108036440
yeah i guess so. never seen it as possibly being lewd, but now that you mentioned it does seem a tad intimate,sensual
>>
>>108036448
You didn't, go away retard.
>>
File: 1759929229062293.mp4 (1.11 MB, 720x1072)
>>108036184
>>
>>108036454
bait
>>
>>108036441
But they smell and are gay.
>>
>>108036462
k moron
>>
curse vishnu
>>
File: 00133-567300872.jpg (1.48 MB, 2048x2688)
>>108036152
The new model only takes 5 minutes on a 2 pass and adetailer run
>>
Prompt executed in 349.40 seconds
>>
>only 5 minutes
is ran for real? 5090 btw
>>
what gen time can I expect for z image base if I get a 5070ti?
>>
File: r8zxi81lz5781.jpg (79 KB, 1080x1137)
>>108036460
>never seen it as possibly being lewd, but now that you mentioned it does seem a tad intimate,sensual
it has similar vibes to something like Therese Dreaming by Balthus, juxtaposing an otherwise innocent situation with an eroticized perspective


while looking for an appropriate image I found that Maya Hawk (from Stranger Things) recreated it for a music video or something which is cool since I was going to probably make something similar with Qwen Image Edit or something
>>
ok what are anons using to prevent random text and multiple views?
>>
File: 1769993033893146.jpg (1.07 MB, 2432x1664)
>>108036543
It may be my porn-fried brain speaking, but it is just a girl sitting to me
reposting this one since I borked my ComfyUI and now I need to redownload the models fml
https://litter.catbox.moe/bmu4wcoi474tfucx.png
>>
>>108036559
Using a better model.
>>
File: z-image_00031_.png (966 KB, 800x1280)
>>108036125
100%. it's like getting worked up about a bad tweet; who cares. anyone can put some shit on the internet. it isn't more official because it has a huggingface repo and a card. some misguided kid wants a career in AI shit. it's fine.
>>108036155
bullshit. i have a 200k booru dataset that covers the top 5000 characters, and every tag and style otherwise, quite generously. you don't need a huge dataset. when you're feeding back weight adjustments over 200k samples, how much uniqueness do you think you need? it already has the general knowledge. come on
re: the pic - it's borked but i like it. the signature is so good
>>
File: ComfyUI_00303_.png (906 KB, 1024x1024)
>>
File: da3.png (1.81 MB, 1008x1024)
>>
File: 169969-tmp.png (3.17 MB, 1368x2000)
>>
>>108036598
>bullshit. i have a 200k booru dataset
Absolutely nothing.
Illustrious v2.0 was trained on 20M images. If you can't do even a fraction of that then your finetune is a waste of electricity.
>>
>>108036598
You are better off turning them into individual loras.
>>
>>108036612
is that a crash bandicoot level?
>>
>gemma 3 12b abliterated for ltx2
snake oil?
>>
>>108036220
>More frames per second, higher resolution helps. If you have a 5090 or better, you can skip the upscaler and just gen at higher resolutions. Using
It's really slow then, isn't it?
>>
>>108036639
>>108036650
You guys have never done any serious finetuning. I don't care how many images Illustrious was trained on. The results of my training have only been positive. These anime models output cooked shit anyway. Fuck off
>>
>>108036692
it's not unreasonably slow. Still faster than wan
>>
>>108036707
neither did tdrustled
>>
>>108036711
I'll have to try then, I thought the spatial upscaling was mandatory for the model.
>>
File: 1757423467553860.mp4 (2.77 MB, 1024x1024)
>>
>>108036735
if you wanna go crazy, you know how the initial sampling is at 50% the size of your final resolution? Don't downscale, so that the longest side, for example, ends up being 2048 lol. The quality is certainly better but the gen takes 5+ minutes - it works on my 5090
>>
File: 1756767065549600.mp4 (1.09 MB, 720x1056)
>>108036631
>>
>>108036707
>These anime models output cooked shit anyway.
Then why are you even doing it? Either commit fully or fuck off.
>>
File: 5yuii.png (1.4 MB, 1920x1200)
>>108036658
it's a FAIL t2i of a Quake hallway. The real fail is trying to then wrangle latent upscale to work without shitting it up
>>108036751
>>108036766
neat
>>
>>108036755
Can you share your wf anon? I'll try that tonight on my 5090.
5min is nothing, I've been genning the first video models for 2h using a 3090.
>>
>>108036755
>2048
you got oom?
>>
It's ironic that the same people that complain about "bloated" models also complain about "overcooked" models
>>
>>108036802
It's almost as if the faggot is priced out of using modern models and wants to take it out on others
>>
File: ComfyUI_temp_juqfx_00005_.png (3.04 MB, 1824x1248)
https://litter.catbox.moe/62n8kcn6pmgwvsti.png
>>
File: 4Way.jpg (3.34 MB, 2048x2048)
sorry went out for a bit
>>
>>108036829
woops meant to reply this to:
>>108036132
>>
>>108036734
ching chong ping pong
>WE SPEND MOST GPUYOUOWA TO FINETUNE MODUR
>>
I need 128 gb ddr4 under $250
>>
>>108036559
its kind of the wild west right now. Setup a NAG node in codex or CC, it helps a lot with negative enforcement.
>>
>>108036540
ackk
>>
This workflow is fucking crazy. I did not know LTX could look this good.
https://litter.catbox.moe/bqiinka0sn8h7ybf.json

https://litter.catbox.moe/m2ryke6uybhfnqpd.mp4
https://litter.catbox.moe/4v6dubcoq7ygzc27.mp4
https://litter.catbox.moe/dclzhivv12hjlshg.mp4
https://litter.catbox.moe/gb8lhopsbaek4mr4.mp4
>>
>>108036440
nobody was talking about this shit until you brought it up. i think it is you who is le pedophile good saar
>>
File: 1754067062761790.png (21 KB, 506x130)
>>108036540
I dunno but about 3 minutes on a 5060 ti using comfy template and 50 steps. comfy has it set it to 25 though
>>
>>108036155
right nobody was unironically saying it was good
>>
>>108036859
I was thinking the same thing. what a bizarre takeaway from that image
>>
>>108036688
"muh censored text encoder" in the context of any diffusion model is always nonsense technically speaking yes, it's not a thing at all
>>
>>108036854
>girl in shower "you like these tiny tits"
>they're not tiny
disappointing
>>
>>108036854
>slopped
>good
>>
catbox sucks fat donkey dick
>>
>>108036854
Great job, and great stuff. This is the exact workflow I need to create a short film! Shame about LTX-2's tin-can audio quality though
>>
>>108036791
it's just the ltx 2.0 Distilled template in comfy with the 'upscale image by' node changed to 1.0. Also, the LTXVNormalizingSampler is supposed to be better than the standard one.
>>
>>108036880
I'm talking about fidelity, not the content of the video. Fidelity wise it blows away wan 2.2 now
>>
>>108036885
y
>>
>>108036885
yeah so fucking slow
>>
>>108036888
OK.
>>
>>108036854
needs more erotic stuff, this asmr is boring
>>
>>108036885
yeah; right now the main catbox site fails silently, with all generated links 404'ing on you; litter seems ok for now
>>
File: 1767970858576089.png (27 KB, 709x385)
>>108036896
>>
>>108036639
quality>quantity thougheverbait, unfortunately
>>
File: afewafewfewfewfew.jpg (331 KB, 1279x957)
>>
>>108036918
aackk
>>
File: ComfyUI_temp_juqfx_00015_.png (3.45 MB, 1824x1248)
>>108036918
Try opening on an anonymous tab, or use another browser. I can access litter just fine in Brave
https://litter.catbox.moe/qhqw4tu3hugsef6s.png
>>
>>108036945
I think I got rate limited or some shit because I opened all four links at once. They started to load then stopped.
>>
>>108036409
>SongMuLA
Ace step comes out February 3rd and will blow that shit out of the water.
>>
governments must have a local model that is at least nbp tier capabilities but fully uncensored
>>
>>108036988
How do you think they'll make the deepfakes for arresting opposition?
>>
File: ComfyUI_temp_juqfx_00021_.jpg (839 KB, 2880x1632)
>>108036988
They might have nbp running on a supermachine somewhere, or at least a dev/experimental fork but running on the same infra as the consumer one does otherwise; we know that openai has a special version for the military for example. That version is for sure less censored. You're onto something here
https://litter.catbox.moe/28hb50ygxtee4ilj.png
>>
>>108036988
I bet that's what is keeping SAI afloat right now
>>
>>108036854
>m2ryke6uybhfnqpd.mp4
ouch my ears
at least it cant get worse from here

>>108036988
>governments must have a local model that is at least nbp tier capabilities but fully uncensored
a reasonable assumption knowing obvious allegiances is that chinese government can do whatever they want with literally any chinese SOTA model, and american government can probably do whatever with Grok's best models. so still not really good enough yet but in a few years video footage probably isn't going to cut it
>>
>>108037055
we'll just have AI forensic experts
>>
>>108036988
Yeah, it's called NBP
>>
how do I unslop ltx2 audio? they all sound like a system
>>
>>108036206
Using your workflow inference on an RTX 6000 is about 42 seconds with 2.2 it/s on the first 40 and 1.2 s/it on the last 20 steps.
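Note the units flip between the two passes (it/s for the first, s/it for the second), which is easy to misread. A quick sanity check that the quoted rates do add up to roughly 42 seconds; the helper name is made up, and this only counts sampling time, not model loading or VAE decode:

```python
def phase_seconds(steps: int, rate: float, unit: str) -> float:
    """Wall-clock seconds for one sampling pass, given its speed and unit."""
    if unit == "it/s":
        return steps / rate
    if unit == "s/it":
        return steps * rate
    raise ValueError(f"unknown unit: {unit}")


first_pass = phase_seconds(40, 2.2, "it/s")   # about 18.2 s
second_pass = phase_seconds(20, 1.2, "s/it")  # 24.0 s
total = first_pass + second_pass

print(f"{first_pass:.1f} s + {second_pass:.1f} s = {total:.1f} s")
```

The 20-step second pass ends up costing more than the 40-step first one, which fits it running at a higher resolution.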
>>
>>108037104
wait for 2.1 / 2.5
>>
>>
>>108037104
https://github.com/haoheliu/versatile_audio_super_resolution can work for some improvement
>>
>>108037166
example data https://audioldm.github.io/audiosr/
>>
File: 00185-4226207588.jpg (1.24 MB, 2048x2688)
>>
>>108037106
2.2 iterations a second? holy
>>
>>108037185
oh god i'm itoooratinnnng ag ahahahh
>>
can somebody fix fp16 computing for all these bf16 models? (anima, z image base)
>>
Illustrious 0.1 needed 20 million anime images and still came out suboptimal and unstable. So making character LoRAs for bigger non-anime models will require tons of data, not to mention full anime finetuning on Klein or ZiT. The character loras I downloaded for ZiT are super unstable, and when the character looks right, the pose is stiff like SDXL or even SD 1.5.
>>
>wasting 5090 for this?
>>
File: ComfyUI_temp_dfcjh_00002_.png (2.64 MB, 1120x1600)
>>
File: 00001-2418188579.jpg (947 KB, 2048x2688)
>>
>>108036988
pretty sure jeets are jerking off to their uncensored ndp lewd gens at google headquarters. personally i hate using nanobanana pro for 1girl gens. qwen, klein and sdxl illustrious are better suited for that.
>>
>>
File: ComfyUI_temp_dfcjh_00004_.png (2.33 MB, 1760x1120)
>>
>>108035985
hamster lookin face
>>
>>108036598
Your model is trash if this is your best sample.
>>
File: ComfyUI_temp_dfcjh_00005_.png (3.46 MB, 1760x1120)
>>
File: 00010-2017891920.jpg (1.15 MB, 2048x2688)
simple message really bothers some anons
>>
>>108036468
yes this is good
>>
>>108036441
skyrim lora?
>>
Need a good I2V wf for ltx2.

Character likeness falls apart after a few frames in all the ones I've ordered
>>
>>108037276
put the bunny, in the box
>>
File: ComfyUI_temp_dfcjh_00006_.png (3.76 MB, 1760x1120)
you're all untalented
>>
>>108037335
>>108036854
use the I2V lora
>>
File: ComfyUI_temp_dfcjh_00007_.png (2.93 MB, 1120x1600)
>>
KEKSTONE!!! WHERE ARE MY ARTIST TAGS!!!!!
>>
File: 00047-3346403117.png (2.55 MB, 1248x1824)
>>108037341
i resent the implication I give a shit
>>
>>108037368
He must have it for the klein tune right?
>>
>>
>>108037382
SEXO
SEXO
SEXO
SEX RIGHT NOW
>>
>>108037382
Give her a belly bulge.
>>
>>108037382
give her a futa bulge
>>
>>108037275
shit don't look that Max saaar
>>
File: 1743615814611994.png (1.51 MB, 1024x1280)
>>108037341
rude!

>>108037395
put a baby in her. it would be really funny haha
>>
>>108037199
that's because niggerfaggots keep doing retarded no caption loras despite it leading to the same kind of rigidity on every model arch that ever has or ever will exist
>>
comfyui is such a piece of shit
and it gets even more shittier with each git pull
>>
>>108037390 (me)
I apologize for losing my cool.
>>
>>108037428
then use something else and stop complaining?
>>
>>108037445
there is nothing else
>>
File: kl7o.png (2.38 MB, 2048x2048)
>>108037328
No lora, but close. It's a t2i of a prompt I got from feeding Qwen3 32B a Skyrim screenshot.
>>108037454
Auto abandoned us
>>
>>108037457
that's remarkably good for a t2i
>>
>>108037341
accck
>>
File: f2k9b_00017.png (1.7 MB, 960x1536)
some of you should consider self harm
>>
>>108037492
no pretty lady don't jump
>>
i still can't figure out why my workflow starts lagging after a few gens when doing XL. refreshing the browser completely fixes it. it has to be the face detailer nodes. what puzzles me is how no one has reported it as an issue in the github. it cant just be me
>>
File: ComfyUI_00405_.mp4 (795 KB, 640x832)
>>108037382
>>
>>108037382
weird. I was just wondering where the asuka posts have gone
>>
File: f2k9b_00020.png (2.18 MB, 960x1536)
>>108037497
too late
>>
File: ComfyUI_00406_.mp4 (787 KB, 640x832)
>>108037492
>>
File: 1748223923283050.png (213 KB, 1176x254)
zimagebros... when will the bf16 training be fixed?

>Conveniently, the fp32 weights for Z Image appear to have "leaked":
https://huggingface.co/notaneimu/z-image-base-comfy-fp32
>>
File: ComfyUI_00407_.mp4 (836 KB, 640x832)
>>
File: 00029-293359364.jpg (1.32 MB, 2048x2688)
>>108037537
Model seems alright
>>
I was gonna train coom lora for Klein but maybe now I don't need to, this one seems good:
https://civitai.com/models/2357212/face-cumshot
>>
>>108037537
>TWO MORE MONTHS UNTIL THE ACTUAL Z IMAGE RELEASES. GET HYPED FOR BASED CHINA!!!
>>
File: f2k9b_00024.png (1.98 MB, 960x1536)
>>108037548
i don't think you even need a lora for cum.
>>
>>108037569
it does white liquid that is similar but clearly not cum, by default, the lora is definitely bridging that gap
>>
>>108037579
>Cum detective
grim
>>
>>108037583
wat
>>
>>108037583
>slop eating jeet
>>
File: 00040-622561523.jpg (922 KB, 2048x2688)
>>
>>108037612
m'booba
>>
File: ComfyUI_09472.jpg (1.93 MB, 2160x1440)
>>108037305
Jenny is my HamHam!
>>
>>108036598
wtf is going on in this
>>
>>108036559
multiple views is literally an actual booru tag. also probably like watermark, signature, logo, artist name, web address, etc
>>
>>108037616
quit samefagging
>>
File: z-image_00039_.png (1.65 MB, 960x1536)
prompting jizz: https://files.catbox.moe/wbntmw.png
>>
>>108037629
was this supposed to be jennete mccurdy this whole time???? lmao
>>
>>108037648
i'm for sure not the guy using what looks like some A1111 variant based on the filename i promisee
>>
>>108037649
sloppa
>>
>>108037656
get better taste then
>>
>>108037545
>>108037612
complete slop
>>
>>108037668
Go cry LMAO
>>
>anima requires @ for artist tags

No wonder it didn't work the other day..
>>
File: 00044-1121034170.jpg (1.15 MB, 2048x2688)
>>108037668
>>
File: ComfyUI_00328_.png (726 KB, 1024x1024)
>>
I love experimenting with new models.
>>
File: 237027814205585.png (1.46 MB, 1232x1696)
>>
remember when /ldg/ was hyped for RadialAttn and then everyone just forgot it existed when it released and it was never mentioned again
>>
>>108037703
>I AM SILLY!
never really liked strawman comics like this. normally means the ridiculed party is correct. comfy does suck btw
>>
comfyui sucks because i am too poor to use api nodes
>>
File: file.png (132 KB, 1385x1203)
>>108037723
You're entitled to your opinion. Just would be nice if you didn't shit up the threads with it for attention.
>>
>>108037722
because it had draconian length and resolution requirements or some shit but I think they might have fixed that
>>
>>108037548
it's meh
>>
I have no implementation and I must refine
>>
>>
>>108037733
are you ok catjak?
>>
>>108037703
That's a nice style. Is it based on some particular artist?
>>
>>108037774
harada takehito
>>
File: 00054-1120702512.jpg (1.02 MB, 2048x2688)
>>108037763
>Anyone negative to him is the same guy
Can't stop exposing yourself eh?
>>
new /ldg/
>>108037746
>>108037746
>>108037746
>>
>>108037776
Thanks
>>
File: file.png (1.81 MB, 1152x864)
>>
File: file.png (1.75 MB, 1152x864)
>>
File: file.png (1.32 MB, 1152x864)
>>
File: file.png (1.9 MB, 1152x864)
>>
File: file.png (1.92 MB, 1152x864)
>>
File: file.png (3.87 MB, 1728x1296)
>>
File: file.png (1.87 MB, 1152x864)
>>
migrate
>>108037746
>>108037746
>>108037746
>>
>>108036540
45-60secs for me at 1024x1024


