[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology

Name
Options
Comment
Verification
4chan Pass users can bypass this verification. [Learn More] [Login]
File
  • Please read the Rules and FAQ before posting.
  • You may highlight syntax and preserve whitespace by using [code] tags.

08/21/20New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17New trial board added: /bant/ - International/Random
10/04/16New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]


Janitor application acceptance emails are being sent out. Please remember to check your spam box!


[Advertise on 4chan]


50/50 Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107386206

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://comfyanonymous.github.io/ComfyUI_examples/z_image/

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: zimage.png (2.52 MB, 1280x1280)
2.52 MB
2.52 MB PNG
>you cant have a lora for a disti-AACK
>>
>>107388729
the list of chinese rugpulls should be specific.

ace step 1.5 is surely one for the list.
>>
>why do you even want the base model? its not like it matters or anything
said by newfags who dont know what distillation means or trolls (more likely the former)
>>
File: z-image_00260_.png (2.95 MB, 1168x2048)
2.95 MB
2.95 MB PNG
>>
>>107388751
You have too much, learn to be happy with less.
>>
>>107388743
ahahahahahahahahahah it has plasters

>>107388754
why the white frame?
>>
File: 1749823899328585.png (3 KB, 94x44)
3 KB
3 KB PNG
Civitai isnt a model sharing site, it's a Civitai coin farming site for cloud generation.
>>
File: 1764526383242205.png (1.22 MB, 1584x672)
1.22 MB
1.22 MB PNG
Ehemmmmmm
>>
>>107388751
we're gonna need lightning lora to not destroy the details for base and very good loras on top of it for it to beat turbo, which is fast and good already, and doesnt even have the v2 dedistillation ostris adapter and more community knowledge on how to train shit
>>
File: ComfyUI_08283_.png (3.6 MB, 1280x2048)
3.6 MB
3.6 MB PNG
>>
I love korean women so fucking much it's unreal
>>
>>107388783
Advertising fag didn't answer the questions because they are false and so moving to neo from comfy is a hassle
>1. Can neo load up workflows from comfy right away?
>2. Can you generate almost the exact same images as comfy if the seed and the rest of the params are the same?
>>
File: z-image_00262_.png (2.87 MB, 1168x2048)
2.87 MB
2.87 MB PNG
>>107388765
A regal young woman standing tall in flowing blue robes, holding a gleaming golden scepter. She has pale skin, long blonde hair, delicate earrings and bangles, and subtle makeup. Her confident gaze meets the viewer's directly as if captured on nostalgic 1990s film flash photography, set against the swirling, crimson surface of Jupiter.

>>107388783
why are you shilling
>>
>>107387555
is this bait? Nochekaiser is the hackiest hack that ever hacked, he LITERALLY trains almost all (if not all) of his single-subject SDXL anime character loras (must of which are wholly useless and of characters the model already knew in terms of Pony / Illustrious whatsever) at 512x512. Like look at the metadata, you will see I'm not making this up, guy shits out garbage with horrible captions, he's only popular because of his unusually high rate of output.
>>
>>107388803
show feeeeeeeeets
>>
File: ComfyUI_00243_.png (1.85 MB, 1504x1024)
1.85 MB
1.85 MB PNG
>>
So why doesn't Comfy just have a lora trainer built into the UI?
>>
>>107388839
i think you can train loras using comfyui
>>
>>107388786
>Kelly Baltazar’s life took a dramatic turn in 2018 when she was arrested at Georgetown University for possessing marijuana and cocaine.
>>
File: ComfyUI_155562_.jpg (446 KB, 1920x1440)
446 KB
446 KB JPG
>>107388839
i thought it did but i've never messed with it or looked too far into it
>>
>>107388839
it does but it's not very good yet. KohakuBlueLeaf made it.
>>
>>107388831
res_multistep does that with zimage

not sure why.
>>
>>107388853
what?
>>
File: zimage dalle lora.png (2.37 MB, 1280x1280)
2.37 MB
2.37 MB PNG
>>
A year from now you will remember how fast these threads moved during this release.
>>
>>107388878
yup, flux2 released, then the threads went crazy
>>
>>107388868
Just reading the lore.
>>
>>107388889
what has that to do with the image I posted?
>>
We have BFL to thank for this amazing release. So many Flux 2 gens itt, to think they managed to pull it off two years in a row.
>>
>>107388893
she's a halfbreed.
>>
Any advice on how to prompt z-image a little better? I want the table and chair setup to be like in alien earth, where it's a few steps INTO the ground rather than above it. I have such stated in the prompt but... doesn't seem to have done it. Good image none the less tho!
>>
>>107388904
who?
>>
1girl, asian
>>
>>107388729
Is there anything specific I should know before trying to train a lora on z image? Or is it just pretty much the same as Flux/Chroma?
>>
comfy charges $20 for their cloud instance with a credits system per image generated. meanwhile local chads running it on 3080 (or even renting cheap 5090) absolutely mog that garbage comfy cloud into oblivion
>>
File: z-image_00265_.png (1.37 MB, 1024x1024)
1.37 MB
1.37 MB PNG
>>
>>107388921
Mayli lol
>>
File: 1760385650860411.png (193 KB, 320x320)
193 KB
193 KB PNG
has anyone tried genning using only racial slurs and foul language? it might be the secret
>>
File: ComfyUI_156132_.jpg (204 KB, 1920x1440)
204 KB
204 KB JPG
https://civitai.com/models/2174504/
this loras artstyle is p nice
>>
>>107388926
check youtube
>>
>>107388910
Literally no idea if this will work, but the official term for what I think you are getting at is called a conversation pit? Popular in the 70s, so it bleeds into a lot of sci-fi design.
>>
>>107388910
ask llm for a detailed prompt
>>
>>107388957
Interesting, I asked for a straight "conversation pit" and this is what it gave me.
>>
File: z-image_00269_.png (2.75 MB, 1168x2048)
2.75 MB
2.75 MB PNG
>>107388910
run an image of the scene you want through a naturalistic captioner
>>
>>107388925
did you prompt for these pretty eyes?
>>
File: 1761735741866348.png (330 KB, 703x617)
330 KB
330 KB PNG
here /ldg/, grab a z-image nipple
>>
>>107388990
cleavage is hotter anyway
>>
>>107388974
Aha, weird, sorry. If you google the term, it'll give you examples, but I guess it wasn't tagged correctly. Good luck anon.
>>
It doesn't know obscure slavshit cars.
>>
File: 0168.png (55 KB, 600x578)
55 KB
55 KB PNG
>>107388990
keep posting
>>
>>107388944
but Mayli wasnt posted, are you faceblind?
>>
File: zimage dalle lora.png (3.02 MB, 1536x1536)
3.02 MB
3.02 MB PNG
>>
>>107389023
>>
File: z-image_00271_.png (3.32 MB, 1168x2048)
3.32 MB
3.32 MB PNG
>>107389007
why would it artyom its a chink model
>>
there seems to be huge image degradation when using nsfw loras on zit, should we just wait til base?
>>
The bottleneck now is my current jizz production
>>
the mayli guy is back.
>>
>>107388987
I prompted for the makeup. The most relevant parts:

>The asian woman has heavy drama wingtip eyeliner, bold colored smokey eyeshadow on her eyelids, kohl, mascara, shiny latex lipstick and jewelry
>She is deeply in love and lust with the viewer.
None of the rest is particularly relevant to eyes.
>>
File: ComfyUI_156067_.jpg (250 KB, 1440x1920)
250 KB
250 KB JPG
wish i had that one anons prompt
>>
>>107389047
thank you anon, it's very nice
>>
>>107388999
All good Anon! Thank you for the suggestion!
>>
>qwen image
is it dead
>>
>>107389037
yeah any more than one lora makes it shit the bed
>>
>>107389061
Depends how good the Z-Edit is.
>>
>>107389061
everything will be/is
>>
>train a lora on a face
>it learns it instantly
>train a lora on a pussy
>it doesnt know what the fuck to do after 10000 steps
explain
>>
>>107389061
qwen is still the best inpainting and editing tool
>>
Which Z-Image quant should i download if i'm on a 3090Ti?
>>
>>107389080
bad dataset, didnt test in cumfart and at 0.7 strength
>>
>>107389081
but flux 2 mogs qwen in every way
>>
>>107389061
The future is Z
>>
>>107389087
bf16
>>
>>107389099
Thanks G
>>
remember perfect hand loras?
>>
>flux 2 realism loras out already
>>
>>107389096
>>
File: 1753285536630605.jpg (1.53 MB, 1920x1920)
1.53 MB
1.53 MB JPG
>>107388951
>>
>>107389122
remember detail loras?
>>
>>107389141
for me its the dog goat
>>
>>107389122
Insane how far this stuff has come in the past few years!
>>
File: combined_0114.jpg (1.2 MB, 3800x2040)
1.2 MB
1.2 MB JPG
>>
>>107389172
Flux is closer to the sovl but Z has better looks.
>>
>>107389056
No problem
>>
File: combined_0145.jpg (1.02 MB, 4043x2040)
1.02 MB
1.02 MB JPG
>>
reee, diffusion-pipe has no block swap for z
>>
>>107389061
>chroma
is it dead
>>
File: z-image_00274_.png (3.49 MB, 1168x2048)
3.49 MB
3.49 MB PNG
>>
>>107389190
do you really need it? you could probably get it down to like 8gb with ai toolkit
>>
HDD speeds are nuts
>>
>>107389127
https://civitai.com/models/2180562/boreal-flux-dev2-boring-reality-lora-for-flux2-dev?modelVersionId=2455415

Based

Flux.2 Chads are ahead of Z. Remind me again why use the vramlet model when you can prompt this model in any language and it also knows Japanese?
>>
File: 1734365558842318.png (1.69 MB, 1120x1440)
1.69 MB
1.69 MB PNG
Z does this better than any other model i've tested. still a tough prompt though
>>
>>107389187
flux2 still has hand problems.
>>
File: 1733626459504718.png (10 KB, 237x229)
10 KB
10 KB PNG
>>107389191
now all the loras i made will go into storage and never get touched ever again
>>
File: combined_0005.jpg (861 KB, 2040x4000)
861 KB
861 KB JPG
>>
>>107389217
neat!
>>
>>
Is there a good view angle control lora for Z?
>>
>>107389172
>>107389215
Flux.2 is also just kino at manga. Z may be overhyped after all. I patiently wait for Flux.2 Klein.
>>
File: ComfyUI_temp_uuabr_00007_.png (2.18 MB, 1280x1440)
2.18 MB
2.18 MB PNG
>>
>>107389215
vram required for training must be huge
>>
>>107389232
chroma is better than flux2, and can mess up the hands too just like flux 2.
>>
File: combined_0104.jpg (742 KB, 2040x3637)
742 KB
742 KB JPG
>>
>>107389215
I'll wait for the schnell distill.
>>
>>107389236
deliciously plump thighs
>>
>>107389274
did the prompt fail to include the eyepatch?
>>
>>107389215
>everything melting together
Ahh flux..you never change. How is this model 32B....
>>
>>107389274
What tool are you using to run the 32B version of Qwen3-VL?
>>
File: ComfyUI_temp_uuabr_00008_.png (2.56 MB, 1280x1440)
2.56 MB
2.56 MB PNG
>>107389292
indeed, thanks to newbie anon for his good gens
>>
>>107389172
Can you explain for the retards in the back who missed previous threads? WTF is a qcp
>>
>>107389191
it was inferior to qwen day 1
>>
File: combined_0182.jpg (728 KB, 3911x2040)
728 KB
728 KB JPG
>>107389303
Negative, prompt is too long for a comment, but it missed the eyepatch. I wasn't sure if it made sense to include the reference images because the point isn't to evaluate the Qwen VL, but the results are generally close enough to be a good baseline.
>>
>>107389331
quantized cp
>>
>>107389345
the important thing here is the prompt and the two results, not the initial image, because too much style will always be lost in all VLLMs
>>
File: combined_0006.jpg (1.16 MB, 4800x1720)
1.16 MB
1.16 MB JPG
>>107389331
My categorization, disregard

>>107389309
A basic python script using transformers to run https://huggingface.co/coder3101/Qwen3-VL-32B-Instruct-Heretic
>>
File: ComfyUI_temp_uuabr_00014_.png (2.33 MB, 1280x1440)
2.33 MB
2.33 MB PNG
>>
>>107389345
>>107389274
1 looks the best stylistically, but 3 appeals the most to 1girl jeets
>>
>>107389305
There's another one that's trained on both Z and Flux.2, and Flux.2 one mogs Z

https://civitai.com/models/1662740?modelVersionId=2449027

Even the trainer acknowledges Flux.2 version is the best. There's only so much you could get out of 6B model, though it's fun to play with. Seed variation Chads just know there's much more quality to get out of Flux.2 over a model that suffers from same symptoms that other Chink models suffer from (weak training, overfit which kills seed variety and causes sameface)
>>
>>107389191
last update to official radiance was ~2-3 days ago and people are also still finetuning other chroma models
>>
File: combined_0066.jpg (512 KB, 2040x3760)
512 KB
512 KB JPG
>>
File: file.png (911 KB, 1779x757)
911 KB
911 KB PNG
>>
>>107389399
no.. z-sisters our response?
>>
Why is the text encoder for z-image 7.49gb?
>>
File: 1739303180376008.png (9 KB, 246x216)
9 KB
9 KB PNG
>>107389420
text encoders are big
>>
>>107389399
https://civitai.com/models/1134895
hope he does this one for flux2 too then
>>
>>107389420
because its an llm which actually allows it to understand the prompt much deeper and because its unloaded during generation anyway
>>
>>107389373
Oh hey this is much better than the outdated script on Qwen-VL's official page. Thanks for the link.
>>
>>107389403
>just that one sarah peterson anon making loras
yeah not ded lol
>>
>>107389191
After Z gets a substantial nsfw tune.
>>
File: 00045-3047408638.png (1.88 MB, 1152x1728)
1.88 MB
1.88 MB PNG
>>107389420
If that's too big, you can always use smaller gguf text encoder. Like this: https://huggingface.co/Mungert/Qwen3-4B-abliterated-GGUF/tree/main
>>
>>107389417
He didn't say Flux.2 was the best. In fact he said Qwen version was the best.

>--Qwen
>Works perfectly and much better then other versions.
>>
>>
>>107389460
>>
File: ComfyUI_temp_puclc_00002_.jpg (628 KB, 2000x1440)
628 KB
628 KB JPG
Btw whatever happened to the talks about the Noob dataset being used for tuning Zimg further?
>>
>>107389327
Can you share the prompt?
>>
>>107389479
It has been about 48 hours so it's likely abandoned.



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.