[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology


Thread archived.
You cannot reply anymore.


[Advertise on 4chan]


File: 1758995333207728.png (2 MB, 864x1536)
2 MB
2 MB PNG
Previous /sdg/ thread : >>107328651

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Advanced UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge Classic: https://github.com/Haoming02/sd-webui-forge-classic
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Stability Matrix: https://github.com/LykosAI/StabilityMatrix

>Z-Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Flux.2 Dev
https://comfyanonymous.github.io/ComfyUI_examples/flux2
https://huggingface.co/black-forest-labs/FLUX.2-dev
https://huggingface.co/city96/FLUX.2-dev-gguf

>Qwen Image & Edit
https://docs.comfy.org/tutorials/image/qwen/qwen-image
https://huggingface.co/Qwen/Qwen-Image
https://huggingface.co/QuantStack/Qwen-Image-GGUF
https://huggingface.co/QuantStack/Qwen-Image-Distill-GGUF
https://huggingface.co/QuantStack/Qwen-Image-Edit-2509-GGUF

>Text & image to video - Wan 2.2
https://docs.comfy.org/tutorials/video/wan/wan2_2
https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF
https://huggingface.co/QuantStack/Wan2.2-T2V-A14B-GGUF
https://huggingface.co/QuantStack/Wan2.2-I2V-A14B-GGUF

>Chroma
https://comfyanonymous.github.io/ComfyUI_examples/chroma
https://github.com/maybleMyers/chromaforge
https://huggingface.co/lodestones/Chroma1-HD
https://huggingface.co/silveroxides/Chroma-GGUF

>Models, LoRAs & upscaling
https://civitai.com
https://tensor.art
https://huggingface.co
https://tungsten.run
https://yodayo.com/models
https://www.diffusionarc.com
https://miyukiai.com
https://civitaiarchive.com
https://civitasbay.org
https://www.stablebay.org
https://openmodeldb.info

>Index of guides and other tools
https://rentry.org/sdg-link

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>>/vp/napt
>>
>mfw Resource news

11/27/2025

>Z-Image-Turbo: Distilled State-of-the-art image generation model with 6B parameters
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Official FLUX.2 Prompting Guide
https://docs.bfl.ai/guides/prompting_guide_flux2

>AnchorOPT: Towards Optimizing Dynamic Anchors for Adaptive Prompt Learning
https://github.com/zhengli97/ATPrompt

>MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
https://github.com/hustvl/MobileI2V

>Monet: Reasoning in Latent Visual Space Beyond Images and Language
https://github.com/NOVAglow646/Monet

>UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers
https://thu-ml.github.io/UltraViCo.github.io

>Deep Parameter Interpolation for Scalar Conditioning
https://github.com/wustl-cig/parameter_interpolation

>STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows
https://github.com/apple/ml-starflow

>iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation
https://kr1sjfu.github.io/iMontage-web

>The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment
https://ouyangziheng.github.io/ImageCritic-Page

>AlignBench: Benchmarking Fine-Grained Image-Text Alignment with Synthetic Image-Caption Pairs
https://dahlian00.github.io/AlignBench

11/25/2025

>FLUX.2: Frontier Visual Intelligence
https://bfl.ai/blog/flux-2

>FLUX.2-dev-GGUF
https://huggingface.co/orabazes/FLUX.2-dev-GGUF

>FLUX.2 Day-0 Support in ComfyUI: Frontier Visual Intelligence
https://blog.comfy.org/p/flux2-state-of-the-art-visual-intelligence

>Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
https://wakalsprojectpage.github.io/comt-website

>DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
https://zehong-ma.github.io/DeCo

>Syn-GRPO: Self-Evolving Data Synthesis for MLLM Perception Reasoning
https://github.com/hqhQAQ/Syn-GRPO
>>
File: deNE_zi_00009_.png (2.12 MB, 2048x1216)
2.12 MB
2.12 MB PNG
>mfw Research news

11/27/2025

>Progress by Pieces: Test-Time Scaling for Autoregressive Image Generation
https://grid-ar.github.io

>Qwen3-VL Technical Report
https://arxiv.org/abs/2511.21631

>Video Generation Models Are Good Latent Reward Models
https://arxiv.org/abs/2511.21541

>Generalized Design Choices for Deepfake Detectors
https://arxiv.org/abs/2511.21507

>MIRA: Multimodal Iterative Reasoning Agent for Image Editing
https://arxiv.org/abs/2511.21087

>DiverseVAR: Balancing Diversity and Quality of Next-Scale Visual Autoregressive Models
https://arxiv.org/abs/2511.21415

>Canvas-to-Image: Compositional Image Generation with Multimodal Controls
https://snap-research.github.io/canvas-to-image

>Which Layer Causes Distribution Deviation? Entropy-Guided Adaptive Pruning for Diffusion and Flow Models
https://arxiv.org/abs/2511.21122

>CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
https://tele-ai.github.io/CtrlVDiff

>From Diffusion to One-Step Generation: A Comparative Study of Flow-Based Models with Application to Image Inpainting
https://arxiv.org/abs/2511.21215

>Restora-Flow: Mask-Guided Image Restoration with Flow Matching
https://arxiv.org/abs/2511.20152

>PromptMoG: Enhancing Diversity in Long-Prompt Image Generation via Prompt Embedding Mixture-of-Gaussian Sampling
https://arxiv.org/abs/2511.20251

>Text-guided Controllable Diffusion for Realistic Camouflage Images Generation
https://arxiv.org/abs/2511.20218

>OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation
https://arxiv.org/abs/2511.20211

>FlowerDance: MeanFlow for Efficient and Refined 3D Dance Generation
https://arxiv.org/abs/2511.21029

>EmoFeedback2: Reinforcement of Continuous Emotional Image Generation via LVLM-based Reward and Textual Feedback
https://arxiv.org/abs/2511.19982
>>
First for containment general
>>
>mfw MORE Research news

>From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition
https://arxiv.org/abs/2511.20996

>Beyond Realism: Learning the Art of Expressive Composition with StickerNet
https://arxiv.org/abs/2511.20957

>Test-Time Alignment of Text-to-Image Diffusion Models via Null-Text Embedding Optimisation
https://arxiv.org/abs/2511.20889

>Effects of Initialization Biases on Deep Neural Network Training Dynamics
https://arxiv.org/abs/2511.20826

>Training-Free Diffusion Priors for Text-to-Image Generation via Optimization-based Visual Inversion
https://arxiv.org/abs/2511.20821

>Text-Guided Semantic Image Encoder
https://arxiv.org/abs/2511.20770

>RubricRL: Simple Generalizable Rewards for Text-to-Image Generation
https://arxiv.org/abs/2511.20651

>Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout
https://infinity-rope.github.io

>PixelDiT: Pixel Diffusion Transformers for Image Generation
https://arxiv.org/abs/2511.20645

>MotionV2V: Editing Motion in a Video
https://ryanndagreat.github.io/MotionV2V

>MapReduce LoRA: Advancing the Pareto Front in Multi-Preference Optimization for Generative Models
https://arxiv.org/abs/2511.20629

>DINO-Tok: Adapting DINO for Visual Tokenizers
https://arxiv.org/abs/2511.20565

>Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning
https://arxiv.org/abs/2511.20549

>HBridge: H-Shape Bridging of Heterogeneous Experts for Unified Multimodal Understanding and Generation
https://arxiv.org/abs/2511.20520

>DesignPref: Capturing Personal Preferences in Visual Design Generation
https://arxiv.org/abs/2511.20513

>FREE: Uncertainty-Aware Autoregression for Parallel Diffusion Transformers
https://arxiv.org/abs/2511.20390

>The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation
https://arxiv.org/abs/2511.20256
>>
hmmm looks like it might not know quokka
>>
File: PW_147475.jpg (2.84 MB, 2048x2048)
2.84 MB
2.84 MB JPG
I love the OP pic haha!
Good night, anons! I'm getting pretty tired! I hope to see you all tomorrow! :]
>>
there's quokka

debo+monkey
bun-chan+monkey
what's-her-face + bunchan

>>107343365
gn and happy thanksgiving
>>
i'm gonna have to give the win to flux2 on this prompt lol
>>
took way too many attempts
>>
File: deDL_zi_00048_.png (2.66 MB, 2048x1216)
2.66 MB
2.66 MB PNG
>>107343365
gn

>>107343447
>platypus quokka
>debo is monke
>bunchan has consumed pw
this went wrong in all kinds of good ways
>>
>>107343531
pw keeps getting shafted
>>
>>
File: deDL_zi_00049_.png (2.51 MB, 2048x1216)
2.51 MB
2.51 MB PNG
>>107343623
the quokka on this one, lmao
in my mind, hed sound like danny devito
>>
File: 0000_2.jpg (373 KB, 1772x2274)
373 KB
373 KB JPG
Kinda sad that the OP pick depicts the only regulars who stuck around considering the fact that there was another variation of this same kind of theme made ages ago, and it had like four times the avatars.
>>
>>107343748

No one is stopping you from generating an image that features everyone else you want in it.
>>
File: 0000_1.jpg (359 KB, 1772x2274)
359 KB
359 KB JPG
>>107343769
Weird how you seem to have taken that personally and gotten upset about it
>>
>says the person who found it sad that the op didn't feature other anons
>>
>>107343748
i remember that gen
i wasnt on it
>>
File: 0000_3.jpg (443 KB, 1792x2304)
443 KB
443 KB JPG
>hes still on this
>>
File: deDL_zi_00055_.png (2.81 MB, 2048x1216)
2.81 MB
2.81 MB PNG
>>107343748
time passes, people come and go. you can't control any of that. just try to have fun and enjoy yourself along the way

>>107343805
>that gen
the irony of all ironies, anon drew it by hand
>>
File: 0v_1~1.jpg (392 KB, 1808x2320)
392 KB
392 KB JPG
You cannot do something meritorious through the agency of a sin.
>>
>>107343829
>you cant do good if you're doing bad
you dont say
>>
>>107343844
You should learn to read more thoroughly if you're going to try a witty one up to critique someone's logic flow.
>>
>>107343896
you should take some prozac
>>
H A P P Y
T H A N K S G I V I N G
>>
>>107343906
Sorry you're so upset that your bumble got called out.
>>
>>107343910
says the guy butthurt someone made fun of his cliche
>>
>>107343916
That's not how you use the term cliche. You should just stop posting.
>>
>>107343930
Mr. logic over here cant even tell when someone's making fun of him
>>
>>107343943
>>107343910
>>
>>107343948
>>107343916
>>
Wait, they say Z-Image is 6B and the Turbo is a distilled version, and Base is non-distilled. Is Base still 6B just unfinetuned, or bigger? More memory usage for the parameters?
>>
>>107343955
>>107343930
>>
>>107343962
>>107343943
>>
>>
File: rain.webm (3.97 MB, 512x512)
3.97 MB
3.97 MB WEBM
>>107336394
>>
>>107343959
far as i know, just unfintuned. its being worked on.
oh and they personally reached out for noob's dataset so. two more weeks until sdxl, flux, chroma/etc are permanently curb stumped.
>>
>>107344576
Are they going to censor it?
>>
>>107344596
i can make shit like picrel on turbo, i think it's gonna be fully uncensored.
>>
i miss schizo anon
>>
>>107344658
Thanks for the nigbobump.
>>
File: hifi.png (1.81 MB, 1536x1024)
1.81 MB
1.81 MB PNG
>>
>>107344600
If they are actually not gonna purge the naked loli cunny from training data, then china officially unapologetically won.
>>
>>107345094
i strongly doubt they'll waste time with censoring a gigantic dataset like that when they've already released an uncensored checkpoint that people (like myself) are fawning over.

also there's a tard in ldg that admitted to accidentally'ing some CP by pure chance so. their base dataset is not exactly clean either kek
so far, it's a VERY strong img2img realism generator. i think that's how i'm going to be using it until the fully trained base comes out.
>>
File: autumn river.webm (3.89 MB, 1920x960)
3.89 MB
3.89 MB WEBM
>>
File: autumn river 2.webm (3.81 MB, 1920x960)
3.81 MB
3.81 MB WEBM
>>
File: sdg.webm (3.47 MB, 1208x1440)
3.47 MB
3.47 MB WEBM
>>107343501
>>
what model should I use to generate book covers for my gay fanfic library?
>>
File: 00000-3448015618.jpg (1.5 MB, 1344x1728)
1.5 MB
1.5 MB JPG
>>
File: 53454534543.jpg (170 KB, 832x1216)
170 KB
170 KB JPG
>>
File: newoldnew_01023_.png (2.47 MB, 1024x1544)
2.47 MB
2.47 MB PNG
Anon, who recommended me DiffusionToolkit, you're a saint.
>>
>>107346673
Thanks!
>>
File: autumn river 3.webm (3.64 MB, 1920x960)
3.64 MB
3.64 MB WEBM
>>107343907
>>107346347
Happy Thanksgiving!
>>
File: 5-corpart.png (929 KB, 1152x896)
929 KB
929 KB PNG
>>107346673
Cool gen fella
>>
>discord tier general
>>
>>107346790
?
>>
File: newoldnew_00527_.png (2.41 MB, 1024x1544)
2.41 MB
2.41 MB PNG
>>107346734
Thanks. Want to try flux2, but I think my 3080ti is too potato for it.
>>
File: 4-corpart.png (1.41 MB, 1152x896)
1.41 MB
1.41 MB PNG
>>107346919
Ok
>>
>>107343260
really nice.

Finally got a quantized flux.2 to work. Thanks OP for the links!
>>
File: deNT_zi_00002_.png (3.82 MB, 2048x1216)
3.82 MB
3.82 MB PNG
gm, happy thanksgiving

>>107347328
>Finally got a quantized flux.2 to work.
what card and quant?
>>
>gm
>>
>>107347350
Q4_K_M to start. Might move up if vram coopertes.

gm and happy thanksgiving to US anons.
>>
How can I use Gemma 3 as a text encoder in Comfy? I know there is this
>https://huggingface.co/Minthy/RouWei-Gemma?not-for-all-audiences=true
But it's not what I'm looking for. I'm pretty sure I saw someone talking about Gemma 3 previously.
>>
>>
File: oldnew_00363_.png (2.69 MB, 1128x1432)
2.69 MB
2.69 MB PNG
>>107347389
Whats your card and time to generate 1 pic?
>>
>>107347530
4070ti 12gb
>Prompt executed in 555.71 seconds
>>
File: deNT_zi_00008_.png (3.46 MB, 1920x1152)
3.46 MB
3.46 MB PNG
>>107347552
>555.71 seconds
bruh
>>
>>107347577
>Prompt executed in 88.23 seconds
This is with Flux.1D and some post processing.The difference is that flux.1D doesn't have a clue what I'm asking.
>>
https://civitai.com/models/21281/gloria-tyler-and-grace-tyler-tyler-sisters-yu-gi-oh-arc-v-characters
I just started CUI a week ago. I cannot get this LORA to work. What Checkpoint do I use with this? I've tried a few different ones but they either end up not applying the LORA or become eldritch abominations. First time encountering this type of problem and wanna learn.
>>
While I try to set up Z-image on my system TensorArt seems to have access to it, I don't need to build a sentence that describes Quokka's anymore to get something that looks like a quokka, they are like better DALL-E 3 Quokkas
it's really cool.
>>
>>107347886
why is this dog so fucking furry
>>
Also Morning anons :)
>>107347917
>*bark*
>>
File: PW_147480_.png (3.31 MB, 2048x1440)
3.31 MB
3.31 MB PNG
Good morning, anons! I hope everyone is doing well :]
HAPPY THANKSGIVING!!
I decided that i'm not going to work today hahaha I don't feel like it
>>107344522
I love this! Thanks, anon! :D
>>107347945
Good morning, Quokkanon!! :]
Happy Thanksgiving!
>>
"Thanksgiving feast" is a bunch of pumpkins lmao
>>107348033
Happy Thanksgiving PW
>>
>>107347328
flux2? we're all z-image now
i've mostly given on flux2 because it's just unreasonably large. the edit stuff is good, but for just genning you're better off on chroma or z
that said, flux2 does do some nice things (pic related)
>>107347350
>>107347945
>>107348033
happy thanksgiving
>>
File: deNT_zi_00007_.png (3.51 MB, 1920x1152)
3.51 MB
3.51 MB PNG
>>107347945
gm

>>107348033
happy thanksgiving. congrats on the self-assigned day off, lol

>>107348097
>is a bunch of pumpkins
I've gotten that too. I guess in latent space, thanksgiving = pumpkins
>>
File: PW_147481_.png (3.26 MB, 2048x1440)
3.26 MB
3.26 MB PNG
>>107348097
Happy thanksgiving! :D
Hahaha! I put in a bunch of food separately!
>>107348115
Happy Thanksgiving!!!
>>107348160
LOOL sometimes it's nice being the boss hahaha
Happy Thanksgiving!! :D
>>107348215
>Just waiting for a comfyui wf
There's one here!
>https://comfyanonymous.github.io/ComfyUI_examples/z_image/
Just drag the image into comfy :]
>>
>>107348160
Looks like You need to name the food individually so it doesn't come up with a bunch of pumpkins.
>>
we all making themed gens today?
>>
>>
File: z-image, the download.png (287 KB, 1090x647)
287 KB
287 KB PNG
>>107348288
>There's one here!
ty
I found one on gh
>>
>>107348400
huehuehue
you dont need to download alladat, just get the files from comfy (model, encoder, vae)
>>
>>107348450
I figured. We'll I'm already over halfway done. I'll update it later.
>>
File: file.png (37 KB, 342x352)
37 KB
37 KB PNG
>>107348288
>https://comfyanonymous.github.io/ComfyUI_examples/z_image/

Thanks for this link. I've been checking on it now and then through the day and with or without the hyphen it wasn't showing in the results.
>>
File: PW_147484_.png (3.34 MB, 2048x1440)
3.34 MB
3.34 MB PNG
>>107348353
>>107348398
>>107348450
I am!
LOL these are so good!!
>>107348400
Awesome! :]
It works really well!
>>107348523
Any time, anon! :D
It's really fun!
>>
File: same kaiju.jpg (388 KB, 2362x1018)
388 KB
388 KB JPG
I submit another example of low variance in zimage. kaiju turkey produced essentially the same picture 8 times
>>
File: KJ_SEK_CANCAKE4.jpg (881 KB, 3584x4608)
881 KB
881 KB JPG
>>107348398
>>107348353
These are nice. Very positive vibes. Hope everyone enjoys their turkey.
>>
File: deTK_zi_00007_.png (2.57 MB, 1920x1152)
2.57 MB
2.57 MB PNG
>>107348650
happy thanksgiving. nice to see you
how did your sweet potato thing turn out?
>>
File: deAC_zi_00002_.png (2.75 MB, 1920x1152)
2.75 MB
2.75 MB PNG
>>107348292
>Looks like You need to name the food individually so it doesn't come up with a bunch of pumpkins.
it sneaks in pumpkins anyway, lol
>>
>>107348599
>>107348650
thx
debo is once again a monke
>>
>>107348680
Came out really well thanks for asking. Haven eaten dinner yet but it prepared well. Used Ceylon cinnamon instead of cassia for an extra kick of flavor. Good times. You gunna be with family or you have any funky Thanksgiving traditions?
>>
File: deAC_zi_00003_.png (2.33 MB, 1920x1152)
2.33 MB
2.33 MB PNG
>>107348736
debo may have return to monke, but at least he isn't rude enough to start eating before the table is served like SOME people (glares at PW)

>>107348755
that sounds tasty, leave some extra for me xD
>You gunna be with family or you have any funky Thanksgiving traditions?
nah, my family is on the other side of the nation and traveling during the holidays is painful. my tradition is just making sure I have a hearty serving of mashed potatoes and turkey. besides that, lounging :)
>>
>>107348846
>debo stares angrily as pw keeps eating
>>
File: PW_147516_.png (3.14 MB, 2048x1440)
3.14 MB
3.14 MB PNG
>>107348736
>>107348907
LOOL These are so good
I have no idea how to make bunchan haha I wish the coherence thing worked with this model but I tried and it didn't quite work out
>>
>>107348936
>debo still a monke

i used
>a small, young chibi girl wearing a black one piece bathing suit and long black bunny ears, she has a malevolent and evil grin on her face. she is strictly drawn in black and white, unlike everyone else.
>>
File: PW_147519_.png (3.24 MB, 2048x1440)
3.24 MB
3.24 MB PNG
>>107348967
Ohhh!! Thanks i'll try it now! :D
I opted to do the human Koff cause I would get a quokka+monke mix LMAO
>>
>>
File: PW_147527_.png (3.19 MB, 2048x1440)
3.19 MB
3.19 MB PNG
>>107349041
Oh wow that worked immediately haha ty!
It always puts a random person at the end haha
>>
File: sa4.png (1.79 MB, 1280x1440)
1.79 MB
1.79 MB PNG
>Ceylon cinnamon
>>
>>107349056
nice
it helps to tell it
"make a list of [number] of characters:
1.
2.
etc and describe each in detail
of course it still mixes up debo and monke lol
>>
>>107348936
>>107348907
>>107349078
Really love the dynamic size of Quokka on these
>>
File: PW_147533_.png (3.17 MB, 2048x1440)
3.17 MB
3.17 MB PNG
>>107349078
I actually did list em hahaha! I never proompt that way but it worked out really well! Then I just did
>Character 1: blah blah blah
>Character 2: blah blah blah
etc haha
>>
File: deAC_zi_00006_.png (2.53 MB, 1920x1152)
2.53 MB
2.53 MB PNG
>>107349041
this is the funniest bunchan, lmao

>>107349056
>It always puts a random person at the end
the AI is self-inserting because it wants to be part of the fun
>>
who knew debo was the angry emo kid of the family

>>107349116
>the quokka is watching the dinner quokka
lol
yeah with these llm-based text encoders both natural language and programming "style" stuff work
>>107349129
>bun-chan an actual kid
>>
File: PW_147535_.png (3.27 MB, 2048x1440)
3.27 MB
3.27 MB PNG
>>107349103
Hahaha and he's always on the table!
>>107349129
LOL that's gotta be it! It's always the same person too pretty much
>>107349142
LOOL I noticed that too hahaha pretty dark
>>
File: ve3.png (2.35 MB, 1264x1592)
2.35 MB
2.35 MB PNG
>goku evendoe rat
>>
File: PW_147548_.png (3.54 MB, 2048x1440)
3.54 MB
3.54 MB PNG
>>
z now on fp8 scaled
https://huggingface.co/silveroxides/Z-Image-Turbo-SingleFile/tree/main

also happy thanksgiving to the schizos
>>
>>107349314
TWO chromagirls
>>
>>107349315
I'm following you.
>>
File: deAC_zi_00007_.png (2.25 MB, 1920x1152)
2.25 MB
2.25 MB PNG
>the mashed potatoes are begging to be eaten but grandma said "we have to wait for everyone before we can eat"
meanwhile, grandma was always the last one everyone was waiting for. both a saint and satan, every time

>>107347886
this is probably the version you want for your card if the fp16 doesn't fit >>107349315
>>
File: wb6.png (3.59 MB, 1616x2040)
3.59 MB
3.59 MB PNG
>also happy thanksgiving to the schizos
>>
>>107349315
You mentioned using Gemma with ZiT. Did you just generate a prompt with that or did you use some special snowflake node in the actual workflow and if so, how?
>>
>>107349334
that's not me tho lol
>>107347886
here's fp8 scaled
also taef1 vae (speeds up i guess?)
and qwen3-4b-instruct-2507-uncensored-unslop-q8_0.gguf as the TE
>>107349404
yes, that means (you)
although i was speaking of the nogens lol
>>
>>107349421
Oh someone did and I'm going to find out who
>>
File: deAC_zi_00011_.png (2.62 MB, 1920x1152)
2.62 MB
2.62 MB PNG
>>107349404
omg it wojakanon
it took me a few gens to notice
>>
>>107348160
neat
>>
>>107349420
>Gemma with ZiT
wut
i use llama.cpp, through a node on comfy
load gemma3 on llama, use the node to make api calls (locally), feed it the promp;t, it returns the llm prompt
>>107349433
far out
>>
File: mo7.png (1.67 MB, 2024x2272)
1.67 MB
1.67 MB PNG
>I'm following you.
>>
File: PW_147561_.png (2.97 MB, 1840x1440)
2.97 MB
2.97 MB PNG
>>107349327
Hahaha!! I found out how to get rid of the random character! Just made the image size slightly smaller
>>107349404
Happy Thanksgiving! :]
>>
>>107349461
Thank you my dear. Okay then, I wasn't really sure if did remember correctly and I didn't.
>>
File: deAC_zi_00013_.png (2.53 MB, 1920x1152)
2.53 MB
2.53 MB PNG
>everything ghibli
I'm guessing z-image has a lot of gpt images in its training

>>107349461
>load gemma3 on llama,
not sure if he figured it out or not but this anon was asking about gemma >>107347410
>>
>>107349586
technically not a text encoder tho, just a prompt fluffer
gemma3 a good fluffgirl
>>
File: PW_147568_.png (2.96 MB, 1840x1440)
2.96 MB
2.96 MB PNG
>>
>>107349602
>>107349586
yeah i'm familiar with llama.cpp etc
I was looking into this stuff earlier and tried to find out if some custom nodes exist, I found out that there's this one for SDXL though
>https://huggingface.co/Minthy/RouWei-Gemma?not-for-all-audiences=true
but who cares, ZiT is pretty fun so far
>>
>>
>>107348680
happy thanksgiving!
>>
>>107349641
>ZiT
took me a second why you kept saying zit lel
>>
File: Vy6.mp4 (3.12 MB, 768x768)
3.12 MB
3.12 MB MP4
>Alibaba Tongyi Wanxiang
>>
>>
>>
>>
>>107349421
>>107349362
>>107349315
Thanks :)
I'll try it out when I get home
>>
>>107349960
if you go with fp8 scaled dont forget to use the right node:
https://github.com/silveroxides/ComfyUI_Hybrid-Scaled_fp8-Loader
and update if you have it
>>
>14:44:44
>>
>>
File: Vi7.mp4 (468 KB, 1280x720)
468 KB
468 KB MP4
>>107350045
FAIL
>>
File: deAC_zi_00015_.png (2.83 MB, 1920x1152)
2.83 MB
2.83 MB PNG
>>107349696
happy thanksgiving!
I forgot entirely about the wishbone ritual. i wonder who wins (and what chaos is manifest from their wish)

>>107349772
trippy af
>>
>Hes ignoring food anon now
>>
>>
Attack of the Thousand Foot Turkey
https://www.youtube.com/watch?v=TnlfMsPMtHM
https://suno.com/s/WSvBB3TdiYBhefEu
>>
>>
yo bois. I have a big database of nudes so I already have some dataset to work with. so. how can I use these pictures to train a model to create realistic non-reverse image engine searchable pictures for desperate simps that think I am a hot female with them not knowing that I am myself desprate for money since I am broke as shit and an ugly piece of shit irl
>>
File: 2025-11-27 21.54.23.png (840 KB, 1340x887)
840 KB
840 KB PNG
Z-Image Turbo bros how do I get better colours out of it?
>>
>>107350467
use best quality ones (if lora, <200 should do well enough)
caption them with gemini or other llm, full natural language captions
repeat, this time with tags only in a duplicate folder
use a good base to train (suggest chroma or wait for z-image base model to come out). set dataset to [set1] natural language, [set2] tags
set buckets
train 512/768/1024/1152/1536 (whatver your gpu can handle)
3-6k steps should do depending on settings etc
???
profit!

>>107350509
"hdr, vivid colors" in prompt? use more style (photography stuff) too
>>
>>107350509
Try cfg 2 or 3. You probably need to increase steps too by the multiple of ten.
Also specifying 'high contrast' or something helps.
>>
>/ldg/ has discovered llm for prompt enhancement
dark days are upon us
>>
>>107350571
Looks like you're right about cfg. I tried increasing steps and all it does is add fucked up artifacts when I get to ~40. Btw are there better samplers/schedulers for Z-Image? The default example was set to just euler and simple.
>>
>>107350648
There should be an optimal combination of sampler and scheduler. Simple is always the default. Maybe someone has tested it already.
>>
>>107350648
look through this thread
>>107346194
i think the previous one also had several tests by the same anon
when in doubt, just use euler and simple or beta, that's why comfy uses that in their base examples
>>
File: file.png (2.98 MB, 1536x1024)
2.98 MB
2.98 MB PNG
>>107350764
Thanks m8. I tried dpmpp_3m_sde_gpu once and it was a disaster.
>>
>>107350800
you could also get the res4lyf nodes and use deis_2m + beta
it's mostly what i've been using for a while now
>>
>>
>>
It's got better fingers but the penguin isn't on her head.
 a stunningly beautiful japanese fashion idol is standing on a table, surrounded by a crowd, in a night club. She is wearing a slinky dress, showing cleavage. She is standing casually on a table and she is looking up at a penguin standing on her head, balancing two butterflies at each end of a balancing pole it's holding in its peak - like in a circus act. 
The scene is captured by a professional photographer, using professional lighting to perfectly capture the details and her skin tone. He is also documenting her penguin's abilities.
>>
File: deAC_zi_00017_.png (2.43 MB, 1920x1152)
2.43 MB
2.43 MB PNG
>>
>>107351106
push the cfg higher and 30+ steps
>>
this general really went downhill since BMP left
>>
>>
>>107351200
Not bad.
I'll try that.
>>
File: deAC_zi_00034_.png (2.66 MB, 1920x1152)
2.66 MB
2.66 MB PNG
fat and full panda
>>
File: jap.jpg (189 KB, 768x1024)
189 KB
189 KB JPG
>>107351106
Or just get rid of the word salad slop which is useless. I re-edited your prompt and erased redundant entries.

>Candid photography: Single Subject in a Night Club - a female is standing on a table with penguin on her head.
>Japanese fashion idol is standing on a table. She is wearing a slinky dress showing a cleavage. She is also bbalancing two butterflies at each end of a balancing pole,
>>
>>107351462
Redudant entries are confusing the llm. You can't say one thing and then at the end it's another.
It's not about ESL or some /g/ shit but how you construct these things. Don't rely upon your old prompts but write a simple description. Then organize it together.
Don't jump from subject to subject between lines.
>>
stable diffusingly speaking it's like pissing razorblades
>>
File: deAC_zi_00035_.png (2.25 MB, 1920x1152)
2.25 MB
2.25 MB PNG
>>107351490
text encoding don't converge similar concepts?
>>
>>107351546
LLM, in this case Qwen3 4B is somewhat different than the regular encoders. It's not just about keywords. Form a hierarchy and it's better.
I don't know what you background is with local LLMs but I have learned that trying to create simple coherent sentences is easier than word salads.
This is not like SDXL or even like that T5.
Maybe I'm full of shit but I don't think I am.
>>
>>107351541
>im a nigbophile
>>
gonna ask in here too, is there some newer version of https://github.com/KohakuBlueleaf/Kohaku-NAI ? I want a discord bot that lets people use NAI but that one has a pretty old config from before the newest models
>>
>>107351725
Maybe ask /aicg/ or what it is these days.
>>
>>107351801
I want it for images, not text
>>
>>107351872
You need to be willing to walk through the quicksand if you are willing to learn anything.
>>
File: Penguin.mp4 (1.58 MB, 464x688)
1.58 MB
1.58 MB MP4
>>107351490
>>107351462
Ah of course.
>>
>tfw spent 2 hrs fucking with python package dependency hell trying to get something to work only to end up back where i started
>it didnt work
fml
>>
>>107352125
What is your next big film?
>>
on the other hand i didnt break comfy this time
>>
>>
>>107352411
>/sdg/ troons can't into venv's
>>
>>107352411
>>107352533
wise words.

Try trashing your venv and rebuilding it with pythom -m venv venv when in ComfyUI main folder.
>>
>>107352411
>>
>>
>>107352579
nah, the venv is fine, it's just a bunch of side packages that went back and forth trying to find compatible versions. i've given up tho
>>107352600
i did not break the venv
>>
>>107352607
I doesn't matter venv is just a trash directory.
>>
>>107352607
I want to know what are your python directories?
>>
File: deAC_zi_00036_.png (2.25 MB, 1920x1152)
2.25 MB
2.25 MB PNG
>>
>>107352607
>skill issue
you're in the correct containment general
>>
>>107352628
What did you mean? Your family broke you already.
>>
>>107352609
yes, let's keep several venvs and trash them when something doesnt work
that's silly
>>107352618
>>107352628
what
>>
File: deAC_zi_00038_.png (2.38 MB, 1920x1152)
2.38 MB
2.38 MB PNG
>>107352628
>obsessively posts in a general he hates to people he hates, even on holidays
you're so winning
>>
>>107352628
Here is my Thanksgiving song to you:
https://www.youtube.com/watch?v=I_RgZKzF8bU
>>
>all the troons feel attacked
What do you mean by this?
>>
>>107352653
>responding means someone feels attacked
I think you're projecting anon
>>
>heh, I was only ironically smearing myself in shit. you fell for it
>>
>your belly is full
>>
>>107352666
We get it, your family hates you, but this isn't the place to express that
>>
>>107352666
Feel free to slop some visual representation of that out
Bet those images would be better than wordsoup spammer girl gens
>>
File: deAC_zi_00039_.png (2.96 MB, 1920x1152)
2.96 MB
2.96 MB PNG
>>107352672
so true bestie
>>
>>
>>107352686
Thank you dear!
>>
>>107351230
True
>>
>>
File: 00001-3792240816.jpg (1.4 MB, 2048x2560)
1.4 MB
1.4 MB JPG
>>
>>107352772
H-hello? I asked my friends about you and they are all scared
>>
File: 00002-1589955507.jpg (1.21 MB, 2048x2560)
1.21 MB
1.21 MB JPG
hopefully you all had a nice thanksgiving/thursday
gnite
>>
File: deAC_zi_00040_.png (2.83 MB, 1920x1152)
2.83 MB
2.83 MB PNG
>>107352839
you too!
gn
>>
>>107352839
i refused to take my meds
what can they do
but a nice thanksgiving
to you my friend too
>>
What I like about z-image - so far
1. Fast
2. Finger and hands almost perfect
3. Great details
4. Respects your prompts
5. Run on "normal" vram
6. Small faces don't look like monsters
7. Uncensored (as far I know)

What I have challenges with
1. Skin in my gens looks patchy and blurry.
>>
File: deAC_zi_00041_.png (2.75 MB, 1920x1152)
2.75 MB
2.75 MB PNG
>>107352887
>1. Skin in my gens looks patchy and blurry.
try adding the ModelSamplingAuraFlow node and set shift to 7
>>
>>
>>
>>107352901
thanks, I'll test it out.
So far, seems to help.
>>
>>
>>
>>
the biggest complaint i have about z is how it makes everything centered and almost balanced. it's the only type of composition it seems to want to do unless you really force it. it's not really flexible about different angles and positions that are off-center.
>>
https://www.youtube.com/watch?v=UatP53m3_kc
>>
File: deDL_zi_00057_.png (2.49 MB, 2048x1216)
2.49 MB
2.49 MB PNG
>>107353204
yeah, I was struggling against that with this dj prompt. I was trying "to the {right|left}", "uncentered, off-center", etc etc but they're always in the center of the shot
>>
>>107353290
try things like "from the left of the frame" or "off to the side of the image" too. but ultimately you'll have to push it through the samplers/cfg setting to break
>>
>>107353340
gigganigga
>>
esp. when there's too many things going on it'll fall back to "i'll make a centered collag"
>>107353352
lel
>>
File: deDL_zi_00058_.png (2.5 MB, 2048x1216)
2.5 MB
2.5 MB PNG
>>107353377
I wonder if the undistilled base will behave differently. people seem to think that its the distillation causing the rigidness. might just be cope tho
>>
>>107353408
it's overtrained in some aspects, so I'm guessing base may also be. speculation goes both ways, some say base is still getting finetuned, others say the turbo model is just a very early version of the base
>>
>>107353461
I want to ask you a special thing. Can you whip out a jewtube song?
>>
>>107353484
uh ok
https://www.youtube.com/watch?v=6gmswmbosYo
>>
File: deDW_zi_00001_.png (3.87 MB, 1920x1152)
3.87 MB
3.87 MB PNG
>>107353461
I hope this isn't one of those "coming soon!" things where it doesn't, in fact, come soon
>>
>>107353493
2 more weeks bruh
>>
>>107353490
Thanks honey. I will listen to it.
>>
>>107353490
3 seconds and I can't stand it.
Here is my song.
https://www.youtube.com/watch?v=bMFhaGXICWw&list=RDbMFhaGXICWw
>>
Next Thread

>>107353522
>>107353522
>>107352522



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.