[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology


Thread archived.
You cannot reply anymore.


[Advertise on 4chan]


File: collage.jpg (1.65 MB, 2888x2841)
1.65 MB
1.65 MB JPG
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107517471

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>generated 189 image long concert image set
>going through it, thinking "gee what's with random rocks in every other image"
>I had "rock band" in the prompt
>>
>>107521131
>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
why do you leave this shit in here and not rename comfyui to MumbaiUi?
>>
File: ComfyUI_temp_lhnsy_00029_.png (3.23 MB, 1344x1856)
3.23 MB
3.23 MB PNG
>>
File: wan_00039.mp4 (3.34 MB, 880x688)
3.34 MB
3.34 MB MP4
>>
>>
newest prank to ruin people's remote comfy instances is to update
>>
>>107521170
damn that's good
>>
>>107521163
too old she was hotter as a teen desu
>>
wow this thread went to shit
>>
>>107521185
After the last one, I'm not updating until I really need it. I've gotten used to the new button layout now, if they roll it back that will once again be annoying to me
>>
>>107521196
Because I haven't been posting many gens. Sorry.
>>
File: _w_00041.mp4 (1.56 MB, 864x688)
1.56 MB
1.56 MB MP4
wan, that's not what a cheeto looks like
>>
thinking of making a thread split with explicit no drama rule. what do I call the general?
>>
>>107521170
Creative, but a missed opportunity to lewd
>>
>>107521210
can we just use this thread and just report rannigger if she dramabakes again?
>>
thinking of making a poop with explicit no shit rule. what do I call the defecation?
>>
File: ComfyUI_01654_.png (1.28 MB, 1400x1024)
1.28 MB
1.28 MB PNG
/ldg moment right here
Forget the bots
>>
>>107521231
When he bakes he doesn't collage he uses his own gen for OP. You're fighting a ghost.
>>
>>107521231
>Maintain Thread Quality
report ranfaggot bakes. happy gooning

should be the replacement then
>>
holy fuck bros I'm so fucking happy.. just generated my first video.. used a girl I used to know as image... holy fuck this is so good
>>
>>107521249
for not sharing your next gen will oom
>>
>>107521239
she doesn't post images anymore because she was caught samefagging the low effort slop
>>
reply to this post or your 1girls will forever have 6 fingers
>>
>>107521239
actually ranfaggot just makes shitty collages
>>
>>107521265
How would I know the difference?
>>
File: pixart sigma twinflow.png (558 KB, 902x674)
558 KB
558 KB PNG
behold, the first Pixart-Sigma image generated with time_text_embed_2 (Twinflow)

of course, there's not much to look at because Pixart-Sigma isn't trained to use twinflow. now I need to find a dataset..
>>
>>107521306
>fagmojis
>>
File: z_00033_.png (1.48 MB, 1280x1024)
1.48 MB
1.48 MB PNG
>>
File: ComfyUI_temp_lhnsy_00040_.png (3.89 MB, 1344x1856)
3.89 MB
3.89 MB PNG
>>
>>107521343
kek
>>
>>107521163
>arm hair
>>107521193
>she was hotter as a teen desu
most girls are. i dont even know who this is i thought this was someone continuing the jewessposting
>>
>>107521343
Lmao. One should be xir hanging on a noose
>>
>>107521343
holy based
>>
>>107521333
On the terminal even. It's the end of times
>>
>>107521343
ummm one of the schizos broke out of containment
>>
File: top_of_the_pops.png (1.81 MB, 1536x1152)
1.81 MB
1.81 MB PNG
Jimmy Savile.
>>
>best way I can describe using comfyui moving forward is like sitting on a 12" dildo leaving it in and saying "fuck it, I'm gay now" instead of pulling it out with some dignity and saying "what's the next steps"
>>
File: ComfyUI_temp_lhnsy_00048_.png (3.92 MB, 1344x1856)
3.92 MB
3.92 MB PNG
how you guys get rid of the bokeh?
>>
File: file.png (3.1 MB, 2048x1244)
3.1 MB
3.1 MB PNG
>>107521469
you use NAG
https://www.reddit.com/r/StableDiffusion/comments/1pbrbrt/nag_normalized_attention_guidance_works_on_zimage/
>>
File: spongebob-patrick.gif (183 KB, 220x220)
183 KB
183 KB GIF
>>107519747
in a sea of echoing retards i appreciate this information, anon. it puts my mind at ease.
>>
File: 1755709875789631.png (2.32 MB, 1088x1728)
2.32 MB
2.32 MB PNG
i added outfits to a total of 900, but pastebin doesnt allow me to share them because it includes nudity or something.
>>
>>107521518
>>107519747
>Everyone crying for base right now is basically asking for the model that is the leftmost column in picrel to be released.
who said that though? you retards assume anything
>>
how good are these models at turning famous video game characters into photorealistic versions?
>>
>>107521524
Just catbox the txt or use justpastit or any other paste

>>107521554
Worst than nano banana if you're interested in sfw, and they're your only option if you're interested in nsfw so the quality doesn't matter
>>
File: ComfyUI_temp_czaly_00008_.png (3.35 MB, 1600x1600)
3.35 MB
3.35 MB PNG
>>107521473
I'm trying it but it doesnt work, snake-oil
>>
>>107521590
go for
>blur, background blur, bokeh
and on the positive prompt add that it's "a sharp image"
>>
File: onetrainer.png (7 KB, 432x181)
7 KB
7 KB PNG
For some reason OneTrainer cant load this despite diffusers being downloaded there as well
>Loading of single file Z-Image models not supported. Use the diffusers model instead. Optionally, transformer-only safetensor files can be loaded by overriding the transformer.
what the hell
>>
File: ComfyUI_temp_czaly_00023_.png (2.87 MB, 1344x1856)
2.87 MB
2.87 MB PNG
>>
File: ComfyUI_temp_czaly_00029_.png (3.73 MB, 1344x1856)
3.73 MB
3.73 MB PNG
>>
File: ComfyUI_temp_czaly_00032_.png (2.8 MB, 1344x1856)
2.8 MB
2.8 MB PNG
>>
>everyone is busy watching the game awards
>>
File: ComfyUI_temp_czaly_00037_.png (2.44 MB, 1344x1856)
2.44 MB
2.44 MB PNG
>>
File: ComfyUI_temp_czaly_00038_.png (3.87 MB, 1344x1856)
3.87 MB
3.87 MB PNG
>>
File: z-image_00002_.png (1.23 MB, 1024x1024)
1.23 MB
1.23 MB PNG
Z is like 30% faster on linux than windows for me. 6.2s versus 9.5s
Wonder what it could be
>>
What ComfyUi workflow do I use for text to img
I was using Forge up to now
>>
>>107521763
windows using more of your vram for desktop and other things?
>>
>>107521763
Do any speed boosts work on linux? I got a spare drive and want to test comfy on linux with potentially moving everything over from windows. Only issue is my rtx card, which I'm sure will be fun to set up..
>>
>>107521608
Yeah it want's the 'Tongyi-MAI/Z-Image-Turbo' repo, I would prefer it as well if you could just use this file.

Not a big issue though.
>>
>>107521790
google comfyui examples it's connected to the repo you're welcome
>>
>>107521763
Linux is faster in general and in particular for AI since everything AI is developed on Linux, but 30% sounds too much.
>>
>Error: Server did not start in time
always the problem i had with these comfyui custom node implementations of llm, setting environment variables does not help much.

Yeah fuck it, poorly done, so disappointing.
>>
only loads half the time, is it just timing out or does it not have enough vram, well it has enough vram so it can't be that.

thanks for nothing.
>>
and its fucking garbage anyway when it works as none of the prompts are useful to the image models unless you're using flux or something.
>>
Why are you always so angry? You're not even doing anything productive here
>>
>>107521763
Are you launching with --use-pytorch-cross-attention? If not, that should help a bit. Don't know if it closes the gap or just helps both, though.
>>
>>107521910
just like how everyone bitches about cumfart but does nothing to make an alternative?
>>
>>107521910
because fuck you that's why, now go back to geggit.
>>
File: z-image_00010_.png (3.18 MB, 1280x2048)
3.18 MB
3.18 MB PNG
>>107521804
>windows using more of your vram for desktop and other things?
possibly, but i have pretty minimal installs on both

>>107521814
>Do any speed boosts work on linux?
All speed boosts work on linux, windows is the one that has some that dont work for it

installing nvidia drivers is pretty easy on anything ubuntu or arch based

>>107521913
i am not, just with --fast which shouldnt even matter since the models are bf16 not fp16

>Don't know if it closes the gap or just helps both, though.
i will check on linux right now
>>
>>107521608
You're missing some files or they're misnamed.
>>
>>107521790
Templates -> Getting Started -> Image Generation. When it opens, it'll list a base SDXL model that's missing, but you can just close that and choose to load a model you actually want to use. Safetensors files go in the models -> checkpoints subfolder inside the ComfyUI program folder.
>>
>>107521210
I think we should call it /ldgbwtd/ (/ldg/ but without the drama)
>>
File: 1742985168158027.png (2.76 MB, 1280x2048)
2.76 MB
2.76 MB PNG
use-pytorch-cross-attention seems to be slightly slower for me on linux actually, 25ish seconds with cross attention vs 24ish or less without it
>>
>>107521953
>/ldgbwtd/
what are we? faggots or something?
>>
>>107521733
I have it muted on second screen it's way too cringe otherwise
>>
>>107521932
>All speed boosts work on linux, windows is the one that has some that dont work for it

Sweet. Is there any clear guides on how to properly setup comfy with an rtx card? I'm sadly not linux savvy. I did try early this year when wan first released, only managed to generate with sdxl but it would crash on anything video related. So I follow 5 separate guides with 5 fresh separate installs, some say just use default drivers, some say use *this command with this driver version*, etc but no luck sadly.
>>
>>107521961
OK; I'm on AMD (and an iGPU at that), so optimal settings may differ.
>>
>>107521997
please dont give advice if you have an AMD card without explicitly mentioning that first next time thanks

>>107521994
>Is there any clear guides on how to properly setup comfy with an rtx card? I'm sadly not linux savvy.
There aren't, because there's no clear guide needed. You install the nvidia driver, you install CUDA toolkit for your version of cuda. You install torch for the version of python and cuda you're using, and then you install ComfyUI's requirements.txt

and then you clone and install sageattention

and then you're done. Seriously that's the entire tutorial. I guess I could include it in the WAN rentry I've been talking about making for a month now if i ever get around to it
>>
>>107520971
I thought they said this was going to be vue based
So they straight up fucking lied
dragged and shot. Comfy fire these retards
>>
>>107522055
>So they straight up fucking lied
they've been doing that for a year now and you catch on now?
>>
Where does everyone get celebrity LoRA from now that Civitai is cucked?
>>
>>107522157
you train it yourself
>>
>>107522162
on a 3060 12GB? why do you hate me, anon?
>>
File: white.png (3.39 MB, 1824x1248)
3.39 MB
3.39 MB PNG
>>
>>107522170
3060 12GB is more than enough if it's an XL lora
>>
>>107522170
I had that card and I made do, although I preferred wan for t2i and it was real good with loras
z's probably viable but it aint' worth until [redacted] drops
>>
>>107522157
https://huggingface.co/malcolmrey/zimage/tree/main
a few here
>>
>>107522157
a z-image character lora takes 1 hour to train.
>>
>>107522239
holy shit, that's a big list. thanks, anon! <3
>>
>>107522270
local or by cloud
>>
>>107522270
how?
>>
>>107522292
local
>>
>>107522295
ig i gotta install 1000 python packages
>>
>>107522294
ai toolkit. train at 768 and rank 16, 2000-2500 steps
>>
>>107522302
does it come out looking like hot garbage? got any examples?
>>
File: 1747941955869431.png (2.53 MB, 1088x1728)
2.53 MB
2.53 MB PNG
>>107522311
nah it's fine. combining loras is broken through, but hopefully this is fixed if a base is ever released
>>
>>107521590
you need this repo https://github.com/BigStationW/ComfyUI-NAG
the others don't work
>>
>>107522393
comfy doesn't even work, why bother?
>>
>>107522440
dilate your """"pussy"""
>>
>>107522458
projecting much?
>>
File: 000544.jpg (171 KB, 736x982)
171 KB
171 KB JPG
i hope im not wasting my time using Conceptual Captions as my dataset source. most images are under 1024x1024 so i hope that's not an issue

>>107522171
>white
>plains
now do blue black red and green on an island, swamp, mountain, and forest respectively
>>
>>107522302
768 resolution?
>>
Do you believe me about Chinese culture now?
>>
File: 1762950660064031.png (2.08 MB, 1344x1344)
2.08 MB
2.08 MB PNG
>>107522528
yeah, 1024 trains way too slow with worse results. and dont just use 768x768 for everything. use taller images for portraits and even taller for full body shots etc.
>>
File: xjlaw50vzsu31.png (1.14 MB, 1400x5552)
1.14 MB
1.14 MB PNG
>>107522547
always have
>>
>>107522570
yeah. ima have to write a script for this shit.
>>
File: 1750854181777877.jpg (3.51 MB, 2200x2200)
3.51 MB
3.51 MB JPG
does anyone use these vaes?
https://huggingface.co/easygoing0114/Z-Image_clear_vae
>>
File: z-image_00793_.png (2.74 MB, 1152x2048)
2.74 MB
2.74 MB PNG
>>
File: z-image_00794_.png (2.62 MB, 1152x2048)
2.62 MB
2.62 MB PNG
>>
File: z-image_00795_.png (2.61 MB, 1152x2048)
2.61 MB
2.61 MB PNG
>>
File: z-image_00796_.png (2.88 MB, 1152x2048)
2.88 MB
2.88 MB PNG
>>
>>107522635
>>107522651
>>107522658
>>107522664
what have u cooked up here!
>>
File: tauren.png (2.36 MB, 1080x1511)
2.36 MB
2.36 MB PNG
you could be genning cool shit, but all you do is generate generic slop to coom to
>>
File: z-image_00797_.png (2.68 MB, 1152x2048)
2.68 MB
2.68 MB PNG
i have brain damage
>>
File: z-image_00798_.png (2.8 MB, 1152x2048)
2.8 MB
2.8 MB PNG
>>
is there a better/best guide when starting from 0? i'm technical but the amount of info out there and here is a lot and I don't know where to start. one of the anon guides? somewhere else? Aiming to gen realistic images/videos (if it matters)
>>
File: z-image_00799_.png (2.93 MB, 1152x2048)
2.93 MB
2.93 MB PNG
>>107522702
you just gotta mine at it
>>
>>107522671
damn thats cool
>>
>>
File: z-image_00803_.png (2.69 MB, 1152x2048)
2.69 MB
2.69 MB PNG
>>107522670
>>107522671
>>107522729
>>107522753
>>
>loras destroy fine details and adherence
>different seeds make practically no difference even with noise injection tricks
honestly, I'm kinda bored with zit at this point. It doesn't have much potential
>>
multigpu is still broken on latest comfy
>>
>>107522766
Zit immediately felt like a "hard" distill to me. There was no variation or room for interpretation on the output/input.

LoRAs fucked it because they obviously would being a distill.
Too bad we got cucked out of the base model too. People should stop looking so thirsty for models in the future and play it a little cooler.
>>
>>107522768
well no shit, it wasn't updated
>>
File: z-image_00804_.png (2.9 MB, 1152x2048)
2.9 MB
2.9 MB PNG
>>107522766
>>107522766
you don't have much potential
>>
File: z-image_00036_.png (2.86 MB, 1536x1536)
2.86 MB
2.86 MB PNG
>>107522671
>>107522729
>gemini logo in the corner
gr8 b8

>>107522687
>i have brain damage
same

>>107522702
>Aiming to gen realistic images/videos (if it matters)
You're in luck. Realistic image gen is the easiest it's ever been. You just need to Setup ComfyUI which if you're technical you'll be able to do, and download 3 things
https://comfyanonymous.github.io/ComfyUI_examples/z_image/

video is a lot more effort
>>
>>107522786
thanks for the confirm. I already downloaded comfy portable before your reply and funny enough was looking at the z-image sample workflows since it looked best to me. Need to find some old SATA cables to hook up an old SSD for all this stuff
>>
haven't genned anything today, the meds might be working
>>
>>107522858
>Need to find some old SATA cables to hook up an old SSD for all this stuff
uhh what gpu do you have old man?
>>
>>107522871
why I got my trusty ol' gtx 680
>>
>>107522871
>>107522884
lol new build, 5060ti 16gb, I just don't want to fill up my new m2 with this stuff.

but shit, might be showing my age with the parts I have laying around though..
>>
>>107522898
>I just don't want to fill up my new m2 with this stuff.
it's pretty much a requirement for model load times that aren't ass. you can use comfy off the SSD and shit images there but always have your model folder in the nvme
>>
I will buy the dgx spark this weekend. What should I test for?
>>
File: z-image_00081_.png (3.73 MB, 2048x1536)
3.73 MB
3.73 MB PNG
i feel like im the only ziggurat who consistently has issues with the amount of toes from behind. i think i'm overprompting for bare feet and toes

>>107522898
people still use sata SSDs all the time, they're fast enough for most use cases.

>>107522909
>you can use comfy off the SSD and shit images there but always have your model folder in the nvme
...oh that might be the reason for the 9.5 seconds versus 6.5 seconds on windows and linux. my linux is on nVME and my windows is on SATA ssd

but no the models are loaded entirely into my ram and VRAM that makes no sense
>>
>>107522916
mental problems? low IQ?
>>
File: 1757558407473884.jpg (34 KB, 660x615)
34 KB
34 KB JPG
>>107522916
>I will buy the dgx spark
>>
>>107522920
>but no the models are loaded entirely into my ram and VRAM that makes no sense
the models are SAVED on the nvme but when you need to LOAD them from disk it takes less time to get it in RAM so you can use it. it doesn't have to do with inferencing
>>
>>107522916
>I will buy the dgx spark this weekend. What should I test for?
>>>/g/lmg
if you don't know why I linked you there, then >>107522924

>>107522935
well this only matters on startup or if you have to offload a model like your text encoder to disk right? Z, the VAE and the TE all fit into my RAM+VRAM so i only see the very first gen after i boot up comfyui to take ~10 seconds longer
>>
>>107522960
You obviously don't know what dgx spark is worth.
>>
>>107522960
>well this only matters on startup or if you have to offload a model like your text encoder to disk right?
yes but zit is ~12gb that needs to get loaded so it takes a long time just to see the first image.
>>
File: zman.png (1.55 MB, 992x1552)
1.55 MB
1.55 MB PNG
>>107522753
Qwen image is still king.
>>
File: 1758114995988501.png (2.44 MB, 1080x1920)
2.44 MB
2.44 MB PNG
>>
File: z-image_00093_.png (3.15 MB, 1536x1536)
3.15 MB
3.15 MB PNG
>>107522983
>You obviously don't know what dgx spark is worth.
it's worth nothing to me, and if you posted about it in the context of /ldg/ its worth less to you than you think. what /ldg/ purposes are you planning on using it for?
>>
>>107523066
Dig through previous threads. I won't spoonfeed
>>
>>107523066
dirty 1girls best 1girls
>>
File: ComfyUI_temp_qario_00018_.png (3.16 MB, 1408x1792)
3.16 MB
3.16 MB PNG
>>107523046
kek
>>
>>107523046
lmao
>>
File: ComfyUI_temp_qario_00021_.png (3.48 MB, 1408x1792)
3.48 MB
3.48 MB PNG
>>
>>107523096
gods I want to sniff her farts
>>
File: z-image_00001_.png (1.06 MB, 1024x1024)
1.06 MB
1.06 MB PNG
>>107523066

thx anon and whoever the other guy was. Easiest shit ever. Just took the default prompt and modified it slightly to prove i'm not completely inept. :)


I was more referring to the e.g. s-video adapter I found in my parts box while putting this new build together. I know SATA is still current
>>
File: ComfyUI_temp_qario_00023_.png (3.68 MB, 1408x1792)
3.68 MB
3.68 MB PNG
>>
File: z-image_00100_.png (2.82 MB, 1536x1536)
2.82 MB
2.82 MB PNG
>>107523070
nice try deebster, i know what a dgx spark is actually useful for

>>107523071
>dirty 1girls best 1girls
i love being white because it adds to the juxtaposition

>>107523113
>thx anon
np gramps
>I was more referring to the e.g. s-video adapter
yeah you got me there, i have no idea what the fuck that is because i was born in the 21st century kek
>>
>>107523127
It's fine you are uninformed, but I was asking for tests.
You couldn't come up with any, so your reply is meaningless anyway.
>>
File: ComfyUI_02558_.png (1.42 MB, 784x1440)
1.42 MB
1.42 MB PNG
>>
File: ComfyUI_temp_qario_00032_.png (3.29 MB, 1984x1280)
3.29 MB
3.29 MB PNG
>>
File: z-image_00103_.png (2.97 MB, 1536x1536)
2.97 MB
2.97 MB PNG
>>107523139
>I was asking for tests.
no (You) weren't

>>107523142
damn ivanka got me with that subtle flipping-the-bird
>>
>>107523173
>>107522916
Blindness isn't something that can be cured
>>
>>107523173
i saw it and considered turning it into an upside down okay hand, but the middle finger was cool enough
>>
>>107523046
heh. he could of spent some time in chroma and automate the shit out of it so it produces proper unique faces. its slow but its all about the end result.
>>
people keep asking about zimage base.
what does model base even mean?
>>
>>107523142
Erika Kirk plsssssssssssssss
>>
File: 1764517098993.jpg (131 KB, 787x917)
131 KB
131 KB JPG
>>107521131
I am fucking tired of ai interpreting "pixie cut" as shaved sides garbage and I want it to FUCKING STOP NOW

This is a pixie cut, you fucking clanker dopes.
>>
>>107523350
Chinese culture
>>
>>107523170
me in the backseat
>>
>>107521469
w-would...
>>
>>107521469
>>107521590
I grew up around a lot of aunties like this. I never did anything naughty with them, and they never tried to do anything naughty with me. They were always strictly proper. However as an older man I am now convinced, ABSOLUTELY convinced that if, as a young guy, I had had the balls to simply walk right up to them and say "I want to play with your milf tiddies", they would have happily let me.

You miss all the shots you don't take...
>>
>>107522593
>they're the same picture.jpg
>>
>>107522593
I don't even know what vae means.
I always just use safetensors, ain't no time to use these snakeoils.
>>
>download concept lora
>try it
>okay.png
>wonder what it looks without it
>get better results
What's the point?
>>
>>107523514
NO BASE
>>
>>107523514
buzz, literally just earning buzz on civitai
>>
>>107523514
Who told you all loras are good? Anyone can make them
>>
You said my accusations against Chinese culture were baseless, yet the only baseless one I see around here is you...

hmmm
>>
>>107523514
sounds like it has a small data set? dont loras in general need a shit ton of diverse references to be good?
>>
>>107523494
ai models work on a smaller version of the final image called latent space, then upscaled into the final image by a separate model called the vae. you always use a vae, but there often is a widespread one everybody uses, and some attempts at providing alternatives with snake-oily benefits
>>
>>107523561
you're courting death
>>
>>107523639
Back to your opium den
>>
>>107523543
This. It's like saying every drawing is good because someone made it put in on the internet.

Practically anyone can train a lora, 8gb vram is enough to train a lora for most models, a large portion of people training loras don't have a clue as to what they're doing.
>>
Loras seem overrated to me anyway. Really only "useful" to lame 1girl posters who need to have 300 variations of some hyper specific fetish, but usually anything more complex has the lora ruin everything else worst case and best case just kills off any variation. Except its just all the same pose of her wearing glasses with only one lens while she does a split or something equally retarded, why do you need 300 of those images?
>>
>>107523567
No, not really, particularly if you are training on a person where all you need is ~20-30 images.

For a specific artstyle it *can* be more challenging if you want it to be able to do very varied output, since then you want variation of subjects within that artstyle to train on.

Overall loras require the least amount of images.
>>
bayse is going to drop and the level of poasting is going to increase to a speed that was previously believed to be impossible
>>
>>107523543
>>107523656
While true, it's also the case that a lora can be perfectly fine, but using it doesn't deliver the desired results, because it's trained on a set of data that is more heavily weighted towards things you don't want, and adding it to your mix skews your prompts away from the ideal you subjectively prefer. It doesn't necessarily mean the lora is bad. Though they can be, it can also just mean the lora wasn't designed with your use case in mind. And there are infinite use cases, so... yeah, it gets a little jumbled.

More than that, there is also the intersection problem. There's probably a better term someone coined for this that I can't think of right now, but basically every single lora- or, forget AI for a second. Every single *set of ideas* in the realm of cognition, while typically defined by what makes it distinct, is going to share something in common with many other sets of ideas. As you overlay multiple sets of ideas together, eventually what's going to become most pronounced is that which intersects the most possible sets. In other words, as you use more loras, there's going to be an almost inevitable slide towards apparent "sameyness" of style just because the things they share in common end up getting multiplied. Which is why there's so many anime AI images that just have that immediately apparent "lazy AI slop" look to them. The trend is towards the mean, it takes effort and careful pruning to nurture a concept towards a distinct extreme. Hell, your own DNA kind of works that way.
>>
>>107523678
>Loras seem overrated to me anyway.
Only because you limit yourself, mixing loras (particularly art) usually bring out better results than any of the loras individually, even if you allow one lora to dominate (have much higher strength) it will still benefit from the way the training has understood the other loras.
>>
>>107523689
Suppose that would be for a character lora. Yes, style loras, or lighting loras, even video loras require some decent variation. Its fun to slop lighting loras to get different flavors.
>>
File: ligma twins.png (75 KB, 558x412)
75 KB
75 KB PNG
twinflow training for pixart sigma is a go

i really should have done sd1.5 first instead. pixart sigma's text encoder is 9gb so alltogether you need 17GB of vram to put everything into the GPU, so I have to use CPU encode, which is slower, so it will actually take like 4 days for a test run of 5000 steps of a 10k image dataset instead of a few hours
>>
>>107523713
I think the problem isn't with the lora itself, but with the model being heavily distilled and turbo'd from the start, ends up making every lora essentially degrade the model
>>
>>107523747
>twinflow pixart sigma
What are these incantations of which you speak ?
>>
>>107523754
That has always been true, but I guess people noticed it more when Flux gained popularity since before that people typically used full finetunes which doesn't suffer from this.

Question is if this is due to training on a distilled model like Turbo or Flux, or if it is a core issue with how loras work themselves. I don't know how SDXL-based finetune loras are affected since I don't use them.
>>
>>107523763
Twinflow just came out for Qwen-Image. It's like lightx2v but allegedly better. It also appears to be really cheap and quick to implement and train.

Only an evening to adapt a well-known architecture and only a week for a training run

So I'm trying it on pixart sigma since it's a tiny model and I only have a 16gb card. Let's see if I can make even slightly coherent 1-step images with it in a few days I guess


The ultimate reason why Twinflow matters is 1-step/few-step WAN SOTA video but twinflow on a 14B model probably requires 8xH100 so a company will have to do that
>>
>qwen image
use case?
>>
>>107523747
Claude came up with the brilliant idea (more like I'm a retard and didn't think of it) to just use bitsandbytes 8 bit for the text encoder. Now I can do 5000 steps in 13 hours

>>107523824
Some anons itt like it
>>
File: 1747832260663318.png (2 KB, 122x94)
2 KB
2 KB PNG
why does the queue number go 1-2-4-5?
>>
>>107523824
NTA but it's prompt comprehension is the best of local models, as is its spatial awareness and text.

That said it is big and relatively slow, I can understand why it hasn't really taken off.
>>
>>
I think -you interrupted me - UnloadAllModels node doesn't actually do shit.
At least on linux. it's better to just ctrl+c after few gens to restore the memory pool back to the os. Then restart.
>>
>>107523896
even compared to flux2?
>>
>>107523908
To add: unloadAllModels will probably release memory reservation inside cum ui but it is still reserved to the os -> .
>>
>>107523908
Just do --cache-none or whatever it is in that case
>>
>>107523824
>try qwen to give character genned with another model a new pose
>turns it into most generic anime character
It's trash.
>>
>>107523937
QIE and QI are different
>>
>>107521131
>https://rentry.org/animanon
fuck off with your shitty drama ran
>>
>>107523930
No i don't do this because it makes no sense to use some "hidden" cum ui flags.
>>
>>107523961
if the normal memory management is not enough: fuck off
>>
>>107523909
Yeah, probably.
>>
>>107523959
Anons don't look at previous OPs and just copy paste whatever was in the previous thread. Just need to do a proper bake without this garbage.
>>
>>107523046
lmaoo, that's good, that's really good
>>
>>107523984
>trust me bro
>>
File: OMG OMG OMG OMG.png (95 KB, 1766x562)
95 KB
95 KB PNG
WE ARE BACK BOYSSSSSS
https://www.youtube.com/watch?v=xb2fjZa_L74
>>
>>107523999
>b-but the Chinese Cultur-ACK
>>
>>107523999
>>107524016
2 more weeks ;)
>>
>>107523999
At least they haven't said yet that they've changed their minds and won't be publishing it, which is good news.
>>
>>107524016
kek
>>
>>107523987
Ran is a horrible person.
>>
>>107523993
You don't have to, so you can pull the stick out of your ass.
>>
>>107523999
My bet on a December release is looking so sweet

>muh chinese culture fags on suicide watch
>>
File: 1740332810108866.webm (3.72 MB, 1280x720)
3.72 MB
3.72 MB WEBM
>>107523999
>not too long
>>
>>107523999
>not to long
That's chinese culture speak, I tell ya!!
>>
>>107524118
>It's been TWO WEEKS SINCE TURBO RELEASED, REEEEEEEEEE
>>
>>107524134
yeah but what means "not too long"? for them it can mean 10 years lol, the simple fact they haven't a release date means they're still far from finished
>>
>>107523999
>but not too long
In Valve time
>>
>>107524142
>the simple fact they haven't a release date means they're still far from finished
No, it can just as well mean the release is imminent, as in them putting the final touch to the model, at which case they will wait with announcing it until that last bit is done.

These models practically never have release dates, either you hear some rumors or it is just released out of the blue.
>>
>>107524161
>These models practically never have release dates
this is a fair point, Alibaba never announced a release date for any of their models, we knew Wan 2.1 and Wan 2.2 were about to be released, but we never had a precise date
>>
>>107523999
They haven't said whether or not it's through an API.
>>
File: ComfyUI_00100_.jpg (1.5 MB, 1613x2074)
1.5 MB
1.5 MB JPG
what is the current meta with models/workflow and UI these days? I'm still using illustrious standalone comfyUI, I want to create comics.

Also
>still no AI board.
>>
meh, who cares, it will release when they decide to.
>>
>>107524217
no one puts a base model on the API space, they're all finetuned
>>
>>107523987
Also need to remove redditUI from the template. This shit has been going downhill for a while now.
>>
File: -1309939556.gif (291 KB, 700x704)
291 KB
291 KB GIF
>previous thread
Tell the llm to keep the enhanced prompt under a number of tokens you want if you don't want yapping. Use Instruct and not Thinking models. Instruct just shits out the result and doesn't talk.
>>
>>107524236
>Tell the llm to keep the enhanced prompt under a number of tokens you want if you don't want yapping.
but it'll stop midway though no? you can't really control how many tokens the llm will say (you can influence it a bit by saying you want a certain amount of tokens on the system prompt I guess)
>>
>>107524236
>Use Instruct and not Thinking models. I
what do you suggest? even the ""non-thinking"" Qwen 3 4b has a thinking process
>>
File: based.png (2.06 MB, 1400x1406)
2.06 MB
2.06 MB PNG
>>107524117
>My bet on a December release is looking so sweet
I can feel they're release that shit on christmas
>>
>>107523999
Once the base model is finished, do you think they'll redo a turbo model out of it, something like turbo v2?
>>
File: Untitled-1ffffffff.jpg (507 KB, 3840x1840)
507 KB
507 KB JPG
This body weight slider is nice.
>>
what's seedvr2? is it any good?
>>
File: 1734603793200620.png (747 KB, 1782x570)
747 KB
747 KB PNG
>>107524287
>tfw you change race if you gain a little weight
>>
>>107523999
https://www.youtube.com/watch?v=ZFz4L6MSfMU
>>
>>107524328
Oh man, the 80s was such a vibe, I was born too late for this kino shit :(
>>
>>107523792
According to its paper, zit has a different distillation concept that, in my understanding, makes it significantly easier to train than flux, which pursues a different distillation concept.
This is never taken into account.
The reason we don't see any decent fine-tuning is simply that hopes for a timely base model release were raised on the very first day, and no one is going to throw away a few hundred or thousand dollars for a turbo model for a few days.
>>
>>107524340
you're born just in time to enjoy chinese culture on your very own computer
>>
>>107524361
kek
>>
The base model is going to super censored.
>>
File: ZIT_00034_.jpg (723 KB, 1600x2296)
723 KB
723 KB JPG
I have no idea why I genned this.
>>
File: ZIT_00045_.jpg (758 KB, 1600x2296)
758 KB
758 KB JPG
>>107524388 (You)
It was to test for this ultimately I guess.
>>
what is the model that a 5090 cannot run?
>>
>>107524452
hunyuan image 3 :)
>>
File: file.png (2 KB, 80x55)
2 KB
2 KB PNG
>publish new node
>get 18 downloads
bros im wonnering!!!
>>
>>107524452
Flux 2, even at Q8 it's 35 gb
>>
>>107524502
I'm glad the StackMoarLayersFags got pwned by Z-image turbo, that'll put an end to that lazy era
>>
>>107524507
>that'll put an end to that lazy era
Lol. the west would rather proclaim any chinese product illegal first
>>
>>107523999
Chinese culture enthusiast here:
Nothing is coming. When they say not too long, they mean for the greenlight to release the model which is by no means guaranteed basically guaranteed to be a no.
>>
File: that's right.png (431 KB, 800x582)
431 KB
431 KB PNG
>>107524582
you look afraid, China will win again
>>
>>107524501
based anon, keep up the good work
>>
>>107524502
more like Flux....poo!!!!!!!!!
you have my permission to use my joke in subsequent /ldg/ threads
>>
>>107524501
Share your snakeoils bro
>>
>>107523999
We're back, we're so fucking back!
https://youtu.be/OATUEO0PxLQ?t=226
>>
>>107524522
Isn't that already the case for z model? iirc they had something in their licence about it being illegal in EU or implying something like that
>>
File: 1762328974672205.png (939 KB, 1920x1080)
939 KB
939 KB PNG
>>107524773
>>
when can i download bing dall-e 3
how can their security be so tight
can you imagine what the world would be like if it leaked
can you imagine the potential of all those things that damned dog has kept from us
>>
>>107524810
dalle 3 isnt even good
>>
>>107524810
I don't get why people still like dalle 3, the drawings are so Ai looking and the real pople are so plastic, only Midjourney has really soulful styles imo
>>
>>107524810
dall-e 3 sucks now, especially with nano banana out
>>
File: 1758302416705177.png (200 KB, 1390x963)
200 KB
200 KB PNG
>>107524236
>"The final answer must be less than 400 words"
>it went for 485 tokens (400 words is usually ~520 tokens)
pretty nice, I'm sure they've been trained to count words, I always found this shit to be useless, not anymore lol
>>
>>107524935
Go lower than 0.8 temp. This isn't Mistral
>>
>>107524981
I have "use_model_default_sampling" activated to it's using the default sampling parameters of that specific model (so the values you're seeing are being overridden)
>>
>>107525014
And the default is?
>>
>>107525018
https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507#best-practices
>>
>>107525027
Do you have system prompt? Try to tell it there and give it soem autistic personality.
>>
>midjourney
>sovl
>>
Does it matter in ai-toolkit if you pick 512+768 or just 768 training ?
>>
File: 00164-1269275483.png (1012 KB, 1152x896)
1012 KB
1012 KB PNG
>>107522781
>People should stop looking so thirsty for models in the future and play it a little cooler.
>>
File: ZiT.png (1.62 MB, 1280x720)
1.62 MB
1.62 MB PNG
>>
>>107525281
holy bad, prompt issue
>>
File: ZiT.png (1.81 MB, 1280x720)
1.81 MB
1.81 MB PNG
>>107525281
>>
>>107522055
it is Vue based they didn't lie about that, but Vue is just a framework like React that abstracts away direct DOM manipulations
>>
>>107525278
what's the prompt to get the youtube controls?
>>
>>107525477
>The youtube video controls are visible
>>
why is wan so bad at making pubes. if i write pubes i don't want a gorilla. i don't want it to be localized on her breasts or forehead either
>>
File: ZiT.png (1.44 MB, 1280x720)
1.44 MB
1.44 MB PNG
>>
File: file.png (62 KB, 594x773)
62 KB
62 KB PNG
guys i think /ldg/ is leaking
>>
File: ZiT.png (1.3 MB, 1280x720)
1.3 MB
1.3 MB PNG
>>107525672
keeek, "Chinese Culture" will have another meaning once base will be released, it won't be a pejorative name anymore, trust the chinks
>>
>>107525680
They should just call it Chinese Culture at this point. Z-Chinese Culture.
>>
you've been hit by...
You've been struck by...
Z-Chinese Culture
>>
File: 1758782107446.jpg (2.58 MB, 2048x2048)
2.58 MB
2.58 MB JPG
>>107524835
>>107524848
>>107524926
i like it a lot for conceptual brainstorming, it's really pleasing for me for coming up with a wide range of concepts, when all my experience with locals is "you get the one thing they trained the model on". it's difficult to aim for specifics, but i like that it gives me a lot of things i didn't consider.

i'm not tribal, idgaf, if you guys can show me a one-button solution to gen these then please share. i can't find anything that isn't just " anime ". i've burned away hundreds of hours into juggling loras to try to approach styles that are more interesting to me and it just feels like infinite dead ends trying to converge on one single idea, and all that effort becomes useless when i want to switch to a new idea.
>>
>>107525760
pony diffusion
>>
File: ZiT.png (1.14 MB, 1280x720)
1.14 MB
1.14 MB PNG
Mission failed, we'll get her next time!
>>
>>107525774
Everything genned by pony looks like pony.
>>
File: 1754001811154676.png (1.76 MB, 1280x720)
1.76 MB
1.76 MB PNG
>>107525760
>>107525812
Z-image turbo tried its best
>>
>>107525784
Gweilo, stop this spam at once!
>>
>catpissjulien
>>
>>107525784
Based
>>
>>107525871
Do you have pixiv
>>
>>107524219
Probably still illustrious/noob until Zit or Qwen is anime finetuned.

But you might find Qwen Image Edit useful
>>
>>107523824
It's one of the best models regardless.

Certainly one of the models to try if training a LoRa didn't work so well on the faster SDXL or ZIT.
>>
>>107523824
>use case?
none, qwen image edit on the other hand is still a really useful model... but it'll be quickly destroyed by the upcoming Z-image edit
>>
File: ZiT.png (1.31 MB, 1280x720)
1.31 MB
1.31 MB PNG
https://www.youtube.com/watch?v=mMfHIEcmaA8
>>
Saar, buy lora. Promise very good.
>>
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/discussions/81#693ab19f0da2263349930ba2
>根据中国一个社交平台上说的,基础模型是会公布的,但现在似乎还在训练当
Translation
>According to a Chinese social media platform, the foundational model will be released, but it appears to still be undergoing training at this time.
based if true
>>
File: 1747527220856557.png (151 KB, 1470x777)
151 KB
151 KB PNG
>>107523999
>>107526107
https://xcancel.com/Ali_TongyiLab/status/1999412639529861318#m
they've been dead silent for more than a week and now they're talking about base again, looks like they are close to the finish line
>>
>>107526093
umm how else is ai going to fund itself chud haha
>>
>>107526123
Have you been stalking them every day? You really do have a burning passion for image generation. Too bad even this thread is sort of empty...
>>
>>107524297
an upscaler for images and video. it works but I found the outputs to be darker and skintones more red when I used it
>>
>>107525986
>can't keep exact view
>useful
wumao typed this
>>
can NewbieAI run in comfy? Nodes from official workflow are still missing and i'm not setting up their fucking fork
>>
>>107526145
>You really do have a burning passion for image generation.
thanks anon
>>
>>107526160
for pure edit yeah I'm not a big fan of the zoom it, but when you use it as a character lora it works well on anime
>>
>>107526163
No problem. I'm still waiting to see a single image from you.
>>
What will we kvetch and doompost about after Z base?
>>
>>107526187
Where's your image anon?
>>
>>107526202
Multiple images in OP colleague. I'm still at work, but will begin creating new ideas in few hours.
>>
>>107526093
you can upload the shittiest slop lora with the worst example videos/images and put it on early access, and if anyone calls you out for your shitty grift you can just hide their comments. great site
>>
fresh bread
>>107526185
>>107526185
>>107526185
>>
>>107526226
kill yourself schizo
>>
>>107526123
hes replied to the turk then its surely coming
>>
>>107523075
very nice, could you share the workflow for this?
>>
>>107522171
>>
>>107521170
wow that's a good one
>>
>>107521354
wood
>>
>>107523113
IDE is better than SATA, change my mind



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.